EDITORIAL


Build capacity for climate action 


It is clear that climate action is not on course either
to achieve agreed-upon temperature goals or to 
protect people from increasingly severe climate im- 
pacts. The United Nations climate meeting (COP28) 
now underway is being called upon to provide a 
“course correction,” and the Global Stocktake (GST)
to assess progress under the Paris Agreement has 
identified the need for “systems transformations.” As 
in previous years, COP28 will undoubtedly focus on 
finance and technology development and transfer, but 
these alone will not enable adequate, effective, and 
equitable climate action. It is time to expand focus to 
the third means of implementation: capacity building, 
without which a course correction involving multiple 
systems transformations is not going to occur. 

Although definitions of capacity vary, the deceptively 
simple one of Youba Sokona [former vice chair of the 
Intergovernmental Panel on Climate Change (IPCC)] is 
useful: “not the ability to implement 
someone else’s agenda but the ability 
to set and pursue your own agenda.” 
This requires understanding the 
challenge; weighing options; design- 
ing plans; and then coordinating, 
communicating, and implementing 
them. Multiple forms of knowledge 
and other resources, including fi- 
nances and institutional arrange- 
ments, must be integrated at each step. 

Capacity building has been recognized in the global cli- 
mate policy regime for at least two decades but has seen 
limited progress. During the GST process, participants 
highlighted that capacity gaps were impeding climate 
action. Similarly, the majority of developing countries’ 
nationally determined contributions under the Paris 
Agreement called for capacity building. Meanwhile, bi- 
lateral capacity-related funding continues to be inad- 
equate and tied to donor preferences. Of the estimated 
$1.8 billion in annual philanthropic disbursements, ca- 
pacity-building-related activities accounted for only $65 
million, much of which flowed to the Global North. 

The disjuncture between what is needed and what 
has been accomplished can be explained in three 
ways: a lack of scholarship that can provide clarity 
and guidance; the complexity and contextual nature 
of capacity for climate action, which complicates 
scaling or transfer; and skepticism stemming from 
historical experiences with nonsystemic approaches 
to capacity building. None of these are insurmount- 
able barriers. But the academic community, govern- 
ments, and funders must reexamine their approach. 




There is increasing awareness of the knowledge 
needs for systems transformations. For example, the 
Global Sustainable Development Report highlights the 
role of science and scholarship in accelerating trans- 
formations. Similarly, climate and development lead- 
ers, such as the late Saleemul Huq, have long insisted 
on sustained support for independent, research-active 
educational institutions to generate knowledge and do- 
mestic talent. These developments are essential, but the 
challenge is broader. It requires deepening our under- 
standing of what the components of capacity are, how 
they can be supported and retained, and what a system- 
atic approach to capacity building would entail. 

Unfortunately, no global institution is focused on as- 
sessing or building capacity for climate action. Reports 
by the IPCC do not include a chapter on capacity build- 
ing, unlike the two other means of implementation, 
technology and finance. Capacity building was a theme 
within the GST, but without solid 
scholarship to draw on, the discourse 
remained too generic to provide 
meaningful guidance. In the absence 
of such guidance, capacity too often 
gets reduced to trainings or other 
one-off and ineffectual interventions, 
which reinforces skepticism about ca- 
pacity building. 

Taking this systems view high- 
lights two urgent tasks that require 
the cooperation of the academic community, govern- 
ments, and funders. One is to deepen our understand- 
ing of capacity needs across different contexts and to 
assess capacity and capacity-building for climate ac- 
tion. This assessment would need research focused 
on both the Global North and South, as opportunities 
for learning across contexts are underexplored. The 
IPCC should include a dedicated chapter, or a special 
cross-cutting report, in its next assessment cycle to 
help integrate this knowledge. Another task is the es- 
tablishment of major programmatic capacity-building 
activities, sustained by adequate funding. Country 
governments, multilateral and bilateral organizations, 
and philanthropies must increase their efforts. In par- 
ticular, capacity building must be integrated as a cru- 
cial component of commitments to adequate climate 
finance for developing countries. 

Hopefully, COP28 will heed these issues. Otherwise 
it seems unlikely that the magnitude and speed of the 
course corrections being called for will be achievable. 


-Sonja Klinsky and Ambuj Sagar 
“I think a lot of people ... thought: ‘Oh God. Not again.’”


WHO official Maria Van Kerkhove, in STAT, about fears that a surge of respiratory
diseases sickening Chinese children, reminiscent of the emergence of COVID-19, was caused by a new
infectious agent. Health officials in China say the diseases stem from known pathogens.


IN BRIEF Edited by Jeffrey Brainard 


LEADERSHIP 


Argentina’s president targets science 


The election last week of libertarian Javier Milei as Argentina’s
next president has many of the nation’s scientists fearing 
the future. Milei, who won 55.7% of the vote, has vowed to 
close or dramatically restructure the National Scientific and 
Technical Research Council (CONICET), Argentina’s main 
science funding agency, and its health and environment 
ministries. He views climate change as a “socialist hoax.” Milei has 
called CONICET, which employs nearly 12,000 researchers and 
ranks as one of South America’s top government science agencies, 
“unproductive” and pledged to “clean up what was dirty by those 
scientists who write stupid things.” Researchers are vowing to fight 
changes they say will weaken CONICET. Milei, who takes office on 
10 December, has few allies in Argentina’s legislature, and may not
be able to persuade it to back sweeping changes.


New Zealand to nix smoking rules 


HEALTH POLICY | New Zealand’s new 
center-right governing coalition plans to 
scrap an antismoking law that scientists 
and public health officials had praised. 
The 2022 law mandates measures slated 
to take effect in coming years, such as a 
ban on selling tobacco products to any- 
one born after January 2009, a restriction 
allowing only 600 approved retailers to 
sell tobacco, and drastic reductions in 
the quantity of nicotine allowed in these 
products. Proponents of the controls 
predicted they would help the country
achieve a smoking rate of less than
5% by 2025. Similar measures recently
announced by the United Kingdom are
thought to have been inspired by New
Zealand’s law. But last week, leaders of
New Zealand’s incoming coalition
government announced they
would repeal the law and use revenue 
from taxing cigarettes to help cut other 
taxes. Health policy specialists say the 
move will likely widen the country’s 
health inequalities. 
Disabled Ph.D.s get less pay


EQUITY | Disabled U.S. residents with science
doctorates who work in science earn
less than their peers without a disability, a 
study has found. A research team analyzed 
National Science Foundation data about 
more than 700,000 workers who received 
a Ph.D. in science, technology, engineer- 
ing, or math (STEM) between 1973 and 
2017, including almost 60,000 disabled 
scientists. The team found that research- 
ers who reported disabilities that began 
early in life (before 25 years old) earned 
$10,580 less per year, on average, than 
scientists without disabilities; the short- 
fall rose to $14,360 for the subset working 
in academe. The gaps were absent for 
those whose disabilities began later in 
life. Previous studies of income across 
occupations that included non-STEM 
ones reported a gap between disabled and 
other workers that widens as educational 
attainment increases. The new study, 
published this week in Nature Human 
Behaviour, also found that disabled STEM
Ph.D. recipients are underrepresented in
top academic positions, accounting for 
less than 7% of college presidents, deans, 
and tenure-track faculty members. The 
authors call for improving pay, working 
conditions, and career opportunities for 
those with disabilities. 


Saudi ties ebb in revised ranking 


RESEARCH METRICS | The number of
highly cited researchers named in a 
leading database as affiliated with Saudi 
universities dropped by 30% this year, fol- 
lowing revelations that scientists working 
outside the country accepted payments for 
declaring such ties, an analysis has found. 
A report last week by the research consul- 
tancy Siris Academic examined the list of 
highly cited researchers (HCRs) released 
annually by the publishing analytics com- 
pany Clarivate. It found that Saudi Arabia 
had 76 of the world’s most cited research- 
ers, down from 109 in 2022—a decrease 
that could cause some Saudi institutions 
to fall in global rankings of universities. 
This year, news outlets had reported that 
foreign HCRs were being paid by Saudi 
institutions to tell Clarivate their primary 
academic affiliation was with them, even 
though many did not regularly work there. 
These switches were suspected of helping 
boost some Saudi universities in interna- 
tional rankings that reflect their number 
of HCRs. Recently, institutions in several 
other countries have pressured researchers 
to correct the affiliations. 




Biobank yields sequence trove 


GENOMICS | The biomedical database UK
Biobank this week released whole-genome 
sequences for half a million people, more 
than doubling the set of whole genomes 
already made available by the nonprofit in 
2021. The new release makes the biobank’s 
whole-genome data set the world’s larg- 
est, offering a uniquely rich resource for 
research on how human DNA underpins 
health and disease, says Eleftheria Zeggini, 
director of the Institute of Translational 
Genomics at Helmholtz Munich. “In the field 
of human genomics, sample size is queen.” 
The whole-genome sequencing effort was
funded with more than £200 million from
the U.K. government, the Wellcome Trust,
and four pharmaceutical companies: Amgen,
AstraZeneca, GSK, and Johnson & Johnson.
By agreement, the companies enjoyed exclusive
access to the data for the past 9 months.


ANIMAL PHYSIOLOGY
Mantis is a record glider


A few species of arthropods are
known for their ability to take to
the air and glide. But the phylum’s
champ for distance is the orchid
mantis (Hymenopus coronatus,
right), thanks to its specially adapted
legs—the first rigid structures discovered
in any animal to function as wings
for gliding, a study has found. The
insects, native to the tropical forests
of Southeast Asia, may use flight to
escape predators. The mantises can
control their flight and travel up to
8 meters, researchers report this
week in Current Biology. That’s 50%
farther than any ant can glide and
200% farther than any spider. The
orchid mantis’ legs sport yet another
adaptation: Their petal shape and
pink-and-white coloration make the insects
resemble a moth orchid, camouflage
that helps them ambush prey.


Prizes aim to boost longevity 


FUNDING | The XPRIZE Foundation this 
week unveiled a $101 million competition 
for discoveries of therapies that restore 
functions lost to aging in older people. 

The largest individual prize, $81 million, is 
reserved for a promising therapy to elimi- 
nate the expected declines from as much as 
20 years of aging, as measured by improve- 
ments in muscle function, cognition, and 
immunity. The prizes prioritize treatments 
benefiting those 65 to 80 years old to maxi- 
mize impact and because 80 is roughly the 
average life expectancy in most developed 
countries. All the XPRIZE Healthspan prizes 
are intended to support early-stage clinical 
studies of approaches that show benefits
in 1 year or less; XPRIZE will develop and
validate metrics for these gains. The foun- 
dation also wants the interventions to be 
affordable worldwide. The principal funders 
of the prizes are Chip Wilson, founder of 
the Lululemon sportswear company, and 
the Hevolution Foundation, which sup- 
ports research on longevity and is financed 
primarily by Saudi Arabia. 
Winning over evolution skeptics


Science educator Amanda (Glaze) Townley brings both scholarship and a religious 
upbringing to her new job as executive director of the National Center for Science 
Education (NCSE), a nonprofit that defends the teaching of Darwinian evolution and 
climate change in U.S. public schools. Threats to both subjects have increased in recent 
years as part of a broader campaign by conservatives to ban certain topics from
classrooms. As a faculty member at Georgia Southern University, Townley helped develop
a resource for high school science teachers that encourages them to acknowledge
and discuss cultural or religious beliefs that could prevent students from accepting
evolution—beliefs that were also part of her childhood in rural Alabama. A longer ver- 
sion of this interview is at https://scim.ag/TownleyQA. 


Q: Why did you decide to focus on 
evolution education? 

A: I grew up in a ministry family that
believed in young Earth creationism.
And when I got to high school, we were
not taught evolution because the teachers
did not agree with that. But when I
read the part of the book we didn’t cover
in class, it made sense to me. That was 
my “aha” moment. 


Q: Why is it important to address student 
misconceptions? 

A: There are plenty of people who know 
a great deal about evolution, but because 
of conflicts with their worldview, they 
reject it outright. So just throwing more 
evidence at people doesn’t really get
to the heart of the matter. ... Knowing
that the teacher respects their views 
makes students more receptive to 
learning about the scientific evidence 
for evolution. 


Q: What is NCSE’s role? 

A: Our goal is for all students in the 
United States to have the opportunity to 
engage with the scientific evidence [for 
evolution and climate change]. And it’s a 
big challenge because there is so much 
misinformation out there about both 
topics. The more money we’re able to
raise, the more of a splash we can
make. Imagine what we could do if we
had enough donor support for a boots- 
on-the-ground effort in every state? 


1 DECEMBER 2023 • VOL 382 ISSUE 6674


Ancient hunter-gatherers enjoyed 
bountiful food from the forests near 
Siberia’s Amnya River. 


Oldest forts challenge views of hunter-gatherers 


8000 years ago—long before farming arrived—people in Siberia built defensive structures 


By Andrew Curry 


People who lived in central Siberia
thousands of years ago enjoyed a
comfortable lifestyle despite the
area’s cold winters. They fished
abundant pike and salmonids from the
Amnya River and hunted migrating
elk and reindeer with bone and
stone-tipped spears. To preserve their rich stores
of fish oil and meat, they created elabo- 
rately decorated pottery. And they built the 
world’s first known fortresses, perhaps to 
keep out aggressive neighbors. 

With room inside for dozens of people 
and dwellings sunk almost 2 meters deep 
for warmth in Siberian winters, the for- 
tresses were ringed by earthen walls sev- 
eral meters high and topped with wooden 
palisades. At some point, they were con- 
sumed by flame, a possible sign of early 
battles. And at least one set of structures 
was built startlingly early: 8000 years ago, 
2000 years before the mighty walls of Uruk 
and Babylon in the Middle East and thou- 
sands of years before agriculture reached 
some parts of Europe and Asia. 
Reported this week in the journal Antiquity,
that early date and the fact that
hunter-gatherers built the structures add to the
growing evidence challenging the textbook 
view that permanent settlements—and walls 
to protect them—could only arise after the 
dawn of agriculture. “To many people, this 
still is not part of what hunter-gatherers are. 
... There’s still an element in archaeology 
that believes complexity develops over time,” 
says University of Oxford archaeologist Rick 
Schulting, who was not part of the research. 
“This is a nice study that demonstrates you 
can have alternate pathways to complexity.” 

The discoveries deep in Siberia are part 
of a wider re-evaluation of how complex 
societies arose. Predictable harvests and 
storable surpluses were needed, traditional 
thinking went, to support large sedentary 
populations, monumental architecture, 
and stratified societies—all of which made 
up what archaeologists called the Neolithic 
package. “If you found something like this 
in the Near East, as part of a farming soci- 
ety, it wouldn’t be a surprise,” says co-author 
and Free University of Berlin archaeologist 
Henny Piezonka. 


In recent years archaeologists had docu- 
mented dozens of fortified settlements in 
central Siberia, an expanse of pine for- 
est crisscrossed by rivers and pocked with 
permafrost and swamps, more than 2500 
kilometers east of Moscow. Researchers gen- 
erally assumed the forts were beyond the 
capabilities of Stone Age foragers and thus 
only a few thousand years old at most,
dating from after metal tools first appeared in
the region. “Hunter-gatherers are still seen
as simple people who had no impact on their
environment,” says Free University of Berlin
archaeologist Tanja Schreiber, a co-author of 
the new study. 

One fort sits on a high spit of land over- 
looking a bend in the Amnya. In 2019, 
Piezonka and a team of Russian and German 
researchers visited the site, days by boat 
and helicopter from the nearest city. They 
documented the defensive architecture, a 
network of deep ditches, banks, and
palisades surrounding a cluster of houses. They
also collected wood and charcoal from the 
settlement’s lowest, and therefore earliest, 
layers, which were visible as bands of black 
organic material in the promontory’s white 
sand. “It’s like they’re drawn with a ruler,”
Piezonka says. 

Radiocarbon dating showed the site’s 
earliest walls and houses were built around 
6000 B.C.E. At that time, local people lived 
by hunting, fishing, and gathering wild 
plants—a lifestyle still partially practiced by 
Nenets and Khanty people in the area today. 

Rich but remote
The Amnya sites lie on a sharp bend in the
Amnya River in central Siberia, a region untouched
by the Neolithic farming revolution thousands of
years ago. Yet early hunter-gatherers there built
massive fortifications, apparently to protect rich
stores of fish oil and meat.

The Siberian findings add to others that
challenge agriculture’s primacy in driving
settlements and cultural complexity. In
Anatolia, the monumental religious
structures of Göbekli Tepe were built even
earlier, around 9000 B.C.E. But those people
were beginning a transition to agriculture.
In contrast, beginning about 10,000 years
ago, hunter-gatherer societies in coastal
areas around the world, including the
Korean peninsula, the Japanese archipelago,
and later Scandinavia, drew on marine
resources to support large settlements. More
recently, complex, hierarchical societies on
the northwest coast of North America lived
in large, permanent, and sometimes
fortified settlements, all sustained by hunting,
gathering, and fishing.

Yet North American societies like the 
Kwakwaka’wakw, Coast Salish, and Tlingit 
were seen as outliers on an evolutionary 
ladder that led from foraging to farming to 
complex states and the origins of modern 
society. “The Pacific Coast is always seen 
as an exception, not as evidence of another 
spectrum of diversity,” Piezonka says.

That view of the past as a standardized 
progression has begun to change, a shift 
captured in the 2021 book The Dawn of Ev- 
erything: A New History of Humanity by 
archaeologist David Wengrow and the late 
anthropologist David Graeber. “We can now
see there are many societies in the archaeo- 
logical record who are hunter-gatherers but 
have many of the features we traditionally 
assumed were associated with farmers,” 
says University of Cambridge archaeologist 
Graeme Barker. 

In Siberia, the abundant resources pro- 
vided by the taiga may help explain the 
complexity reflected in the forts. Annual 
fish runs yielded dried fish, fish oil, and 
fish meal—all high-calorie, long-lasting 
foods. Reindeer, elk, and waterfowl migra- 
tions presented predictable opportunities 
to harvest still more meat to smoke and 
store for the long winter. “They don’t have 
to grow or raise resources,” Piezonka says. 
“The surrounding environment provides 
them seasonally. It’s like 
harvesting nature.” 

At the Amnya site, she 
and her colleagues recovered 
dozens of decorated clay pots 
with pointed and flat bot- 
toms from the earliest lay- 
ers of the pit houses, where 
they were presumably used 
to store the abundant food. 
Once thought to be part of 
the Neolithic package, pot- 
tery may not be exclusive to 
farmers: East Asian hunter- 
gatherer cultures began to 
make pots during the last 
ice age. “Pottery and forts
are like an alternative Neolithic
package,” Piezonka
says. At Amnya, her team
also noted a possible sign of
social stratification, another
development often linked
to agriculture: a cluster of
houses that sat, undefended, 
outside the palisade. 

The fortified settlements, often situated 
overlooking rivers, might have been ways to 
stake out productive fishing spots. “When 
you start to get large numbers of people 
and storage of resources, you start to get 
into the world of competition,” Barker says. 
“Part of that is going and taking.” 

A centurieslong cold spell that started 
about 8200 years ago may have made such 
rich sites particularly desirable. At Amnya 
and other fortified settlements, burned 
layers show that pit houses and palisades 
were periodically consumed by flames, and 
archaeologists found arrowheads in the 
Amnya’s outer ditch—possible signs of vio- 
lent conflict. “These things we think about 
now, like property ownership and social 
inequality—people have been thinking 
about since we became human,” Colin Grier 
of Washington State University says. 


AGRICULTURE 


Maize has an 
unexpected 
wild ancestor 


Genes from second wild 
grass may have helped 
propel its success—but 
scientists don’t know how 


By Lizzie Wade 


Maize is one of the world’s most important
crops, but its origins have
long bedeviled scientists. It took 
more than a century for scientists 
to settle on the idea that it was do- 
mesticated about 9000 years ago in 
the lowlands of Mexico from a wild grass:
a subspecies of teosinte called parviglumis.
But now, a team of geneticists has complicated
that history, reporting this week in
Science that maize as we know it has a second
wild ancestor. Between 15% and 25%
of the genes in all existing maize varieties
come not from parviglumis, but from a 
highland subspecies of teosinte called mexi- 
cana, which hybridized with maize some 
4000 years after people first domesticated 
the plant. 

Maize is “such a well-studied and promi- 
nent plant” that it’s surprising to learn it had 
a long-lost relative, says Logan Kistler, an
anthropologist who studies plant domestication
at the Smithsonian Institution’s National
Museum of Natural History. “There’s
still something this basic to learn about 
maize—that’s wild.” But why the new hybrid 
spread so far and wide is still unclear. 
Jeffrey Ross-Ibarra, an evolutionary biologist
at the University of California, Davis,
started to study the relationship of the
mexicana teosinte subspecies to maize in 
order to understand how the lowland do- 
mesticate adapted to the chilly highlands 
of central Mexico. But, “We kept finding 
evidence of this second teosinte in other 
places we looked,” he says. The team ex- 
amined nearly 1000 maize genomes from 
traditional varieties, modern cultivars, 
and ancient plant remains excavated from 
the southwestern United States to eastern 
Brazil. Unexpectedly, mexicana ancestry is 
“absolutely everywhere,” Ross-Ibarra says. 
Reconstructions of the maize family tree 
suggest it first mixed with the highland 
teosinte between 6000 and 4000 years 
ago. Indeed, the only maize sample the
team found without mexicana ancestry was 
a 5500-year-old cob from the southern coast 
of Peru, thousands of kilometers away from 
where the hybridization was taking place. 

Together, the genetic data and archaeo- 
logical evidence suggest maize moved out of 
Mexico in two waves. After it was domesti- 
cated from parviglumis in the Balsas River 
Basin in what is now the state of Guer- 
rero about 9000 years ago, maize quickly 
spread south along the Pacific coast, reach- 
ing Panama by 7800 years ago and Peru by 
6700 years ago. The ancient cob from Peru 
was the result of this first wave. Then, start- 
ing about 6000 years ago, maize moved up 
into Mexico’s highlands, where it crossed 
with the local mexicana teosinte. 

Like all existing maize, a variety from Oaxaca, Mexico, has two different
wild ancestors.

Shortly after, this new hybrid maize
exploded out of central Mexico, mixing with or
replacing every first-wave variety in Central
and South America. It also headed north, 
reaching the southwestern U.S. about 4000 
years ago. It all amounts to “a much more 
complete panorama of maize’s evolutionary 
history,” says co-author Miguel Vallebueno-Estrada,
a paleogenomicist at the Gregor
Mendel Institute of Molecular Plant Biology. 
Just why the new hybrid varieties spread 
so widely is a mystery. “It made sense that 
introgression from mexicana was impor- 
tant for adaptation to the highlands,” says 
Maud Tenaillon, a population geneticist 
who studies maize at CNRS, France’s na- 
tional research agency, and Paris-Saclay 
University. “But that it’s everywhere—it’s 
something no one would have guessed.” 
One would think “this new maize must
have had an incredible advantage” over first- 
wave varieties, Tenaillon says. But that’s not 
what Ross-Ibarra’s team found. Genetic 
analyses and ancient cobs show first-wave 
maize already had large ears, soft kernels, 
and other desirable traits that differentiate 
maize from teosinte. The researchers iden- 
tified a few possible advantages in second- 
wave maize, including slightly bigger cobs, 
more kernels per row, and the ability to 
withstand more hours of sunlight, which 
could have helped as it moved north and 
south to places with long summer days. But 
nothing stands out as truly transformative. 
“Frustratingly, we don’t have a smoking-gun
answer,” Ross-Ibarra says.

The timing might hold a clue. The mexicana
hybridization happens “right on the
eve of a big transition to more sedentary 
agriculture,” says Andrew 
Somerville, an archaeo- 
logist at Iowa State University. 
The people who domesticated 
maize—and, later, crossed it 
with mexicana—were forag- 
ers who ate a diversity of wild 
foods while also growing small 
patches of maize and other 
plants. But between 4700 and 
4000 years ago, as second-wave 
maize was spreading, the plant 
became the main staple for 
some Mesoamerican communi- 
ties. Many others followed suit, 
and before long maize agricul- 
ture was the dominant way of 
life in much of the Americas. 

Second-wave maize might 
have had qualities that made 
the invention of agriculture 
possible, Ross-Ibarra says. “My 
best guess is that [before mexicana
hybridization], you had a
domesticated maize, but it was 
wimpy or not super-reliable,” 
perhaps because of inbreeding 
and a limited gene pool, he explains. The in- 
flux of genetic variation from the highland 
teosinte may have “turned it into something 
that is really dependable.” Or perhaps the 
early farmers who spread maize through 
migration or trade simply preferred the new 
varieties for cultural reasons. Ross-Ibarra is 
now working with archaeologists and hu- 
man geneticists to trace the relationship of 
maize and people over time. 

The new picture of its origins is a re- 
minder of how much consumers the world 
over owe to ancient Indigenous farmers, 
Vallebueno-Estrada says. “Maize is the com- 
pendium of the work done by so many peo- 
ple over thousands of years. It’s thanks to 
them that we have maize today.” 


SPACE 


Unique Moon
sites could be
‘lost forever’
in mining rush


Researchers seek protection
for pristine areas on Moon’s
far side and polar regions


By Daniel Clery 


Science and commerce may be headed
for a clash on remote terrain: the
Moon. For the first time in half a century,
NASA is sending a craft to the
lunar surface, with the launch at the
end of this year of Peregrine Mission
1, a lander built by the private company
Astrobotic Technology. Dozens of other craft
will soon follow, many as part of NASA’s 
Artemis program to return astronauts to 
the Moon. Most researchers are looking 
forward to a new golden age of exploration 
and science. But some are worried. 

They foresee that the advent of private 
landers will lead to a “Moon rush,” as com- 
panies race to grab valuable minerals and 
resources while trampling over scientifi- 
cally important lunar sites. With space law 
offering little or no protection to these ar- 
eas, researchers are starting to lobby gov- 
ernments and international agencies to do 
something before it’s too late. 

“This is urgent,” says astronomer
Richard Green of the University of Arizona, 
who is setting up a lunar sites working group 
for the International Astronomical Union 
(IAU). Astrophysicist Martin Elvis of the Cen- 
ter for Astrophysics | Harvard & Smithson- 
ian has been highlighting the issue. “We’ve 
got to point out the uniqueness of these sites. 
They could be lost forever,” he says.

Many of the lunar missions—public and 
private—expected to touch down in the 
coming years aim to find resources that 
could support astronauts or be mined com- 
mercially and returned to Earth, such as 
rare-earth elements or helium-3 for still- 
theoretical fusion reactors. NASA’s Artemis 
program explicitly encourages exploitation 
of lunar resources. In 2020, the agency 
awarded an initial set of contracts to four 
companies to collect lunar material for the 
agency’s use. 
Observatories, such as this proposed liquid-mirror telescope, would need to be far from the dust and vibration of mining operations.


Scientists fear such mining could 
threaten unique spots such as craters close 
to the north or south pole whose interiors 
are permanently in shadow. Some remain 
constantly below -225°C, making them the 
coldest places in the Solar System. Orbiting 
probes have shown that their chilly depths 
hold large reserves of ice, perhaps accu- 
mulated over billions of years as asteroids 
transported water from the outer Solar Sys- 
tem. If so, those craters hold an invaluable 
record of the delivery of water to Earth. 

Permanently shadowed craters are also 
ideal locations for space-based infrared 
telescopes that need extreme cold for their 
operation. The recently launched JWST re- 
quires a vast multilayered sunshield and a 
mechanical cryocooler to keep its 6.5-meter 
mirror and instruments cold as it orbits 
the Sun, but in a deep lunar crater, a much 
larger scope, maybe 100 meters across, 
would need no help to stay cool and would 
have sharp enough vision to study the sur- 
faces of Earth-like planets around other 
stars. “I’m not aware of any other body in 
the Solar System with these characteristics,” 
says Jan Harms of the Gran Sasso Science 
Institute, who hopes to build another kind 
of detector—the Lunar Gravitational-wave 
Antenna—in a seismically quiet shadowed 
crater, far from any other lunar activities. 
“The Moon ... is a unique body.” 

But these deep, dark craters—and the wa- 
ter contained within them—will also be a 
prime target for anyone planning a lengthy 
stay on the Moon. As well as providing drink- 
ing water, crater ice could be broken down 
into oxygen for life support and hydrogen 
for rocket propellant, making the areas 
neighboring the craters the ideal site for a 
Moon base. Exploiting the ice would sacri- 
fice invaluable science, says planetary scien- 
tist Paul Hayne of the University of Colorado 
Boulder, who along with colleagues has
called for scientists to be given time to study 
the ice before any extraction starts. The min- 
ing could also cause vibrations that would 
drown out gravitational wave signals and 
kick up gritty Moon dust that could settle on 
and spoil telescope mirrors. 

Radio astronomers are eager to fence off 
another area: the middle of the Moon’s far 
side, shielded by its bulk from the hectic 
radio noise of Earth-bound transmitters. 
Astronomers envision huge crater-filling 
radio dishes and arrays of antennas hun- 
dreds of kilometers across that could eaves- 
drop on the faint hiss from the neutral 
hydrogen gas that filled the universe before 
the first stars lit up. By mapping how and 
when that gas disappeared—ionized by 
starlight—they can watch the emergence of 
the first galaxies. 

The area’s pristine radio quietness is 
already under threat. Space agencies and 
companies are planning fleets of Moon- 
orbiting satellites to help rovers navigate 
on the surface and to relay their data to 
Earth. The International Telecommunica- 
tion Union (ITU) has, since 1971, designated 
the far side of the Moon a “radio quiet zone” 
to protect it for future radio astronomy. But 
even if satellites avoid key frequencies or 
switch off as they fly above telescopes, un- 
intended emissions from onboard elec- 
tronics could be detectable. The ITU rules 
“need to be updated,” says political scientist 
Alanna Krolikowski of the Missouri Univer- 
sity of Science and Technology. 

The 1967 Outer Space Treaty prevents 
nations from making territorial claims on 
celestial bodies but has little to say about 
space mining, which was the stuff of sci- 
ence fiction at the time. The United States 
and a few other nations have argued that 
extracting resources does not require claiming
sovereignty, noting that a country can
fish in the open ocean without asserting 
ownership. To reinforce the U.S. stance, the 
2015 SPACE Act explicitly allows companies 
to extract resources from space and profit 
from them. 

NASA’s Artemis Accords—a set of explo- 
ration guidelines also signed by 31 other 
countries—do provide protection to historic
artifacts, such as the Apollo landing sites,
and proclaim a commitment to a “sustainable
environment in space.” But they do not
explicitly protect areas of scientific value. 

The United Nations could provide further 
guidance, through its Committee on the 
Peaceful Uses of Outer Space. In a paper in 
press with the Philosophical Transactions of 
the Royal Society A, Elvis and Krolikowski 
argue the U.N. committee could designate 
lunar sites of extraordinary scientific im- 
portance. But Krolikowski notes that the 
U.N. “moves on decadal time frames.” 

Telescope builders are hoping the IAU can
make their case. The working group led 
by Green held its first meeting on 27 No- 
vember. At the top of its agenda is defin- 
ing the problem: precisely which sites need 
protecting, for instance, or what level of 
radio interference would be harmful to far 
side telescopes. “We need criteria for judgment,”
Green says. “We need a place to exert
claims, have them considered, and then live 
with the result.” 

Scientists see plenty to celebrate in space 
agencies’ return to the Moon. “I think the 
vast majority of planetary scientists view 
these [launch] projects as opportunities,” 
says Parvathy Prem of Johns Hopkins Uni- 
versity. But Krolikowski hopes the resource 
rush won’t mar that promise. “Everyone 
understands in their gut that science is 
worth protecting,” she says. “Short-term 
interests shouldn’t take precedence.” 
Cheaper microscope may bring
protein mapping to the masses 


Team solves first protein structures with lower cost device 


By Eric Hand 


Talk to any structural biologist, and
they’ll tell you how a cool new method
is taking over their field. By flash
freezing proteins and bombarding
them with electrons, cryo-electron
microscopy (cryo-EM) can map protein
shapes with near-atomic resolution, offering
clues to their function and revealing
bumps and valleys that drug developers can 
target. The technique can catch wriggly pro- 
teins in multiple configurations, and it can 
even capture those that have been off-limits 
to traditional x-ray analysis because they 
stubbornly resist being crystallized. Many re- 
searchers expect cryo-EM will surpass x-ray 
crystallography in the number of new protein 
structures solved next year. 

Yet for all its charms, cryo-EM has flaws: 
The freezing process is finicky, and the 
microscopes are expensive. High-end ma- 
chines can cost more than $5 million to 
buy, about as much to install, and hundreds 
of thousands per year to operate and main- 
tain. Many U.S. states—and countries—don’t 
have a single cryo-EM microscope. “The 
haves and have nots is what it is right now,” 
says Rakhi Rajan, a structural biologist 
at the University of Oklahoma, which cur- 
rently lacks one. 

Researchers at the Medical Research Coun- 
cil’s Laboratory of Molecular Biology (LMB) 
have been working to democratize the field 
(Science, 24 January 2020, p. 354). Today, in 
the Proceedings of the National Academy of
Sciences, the U.K. team describes cobbling 
together a prototype cryo-EM microscope 
that has solved its first structures. The 
machine—what LMB physicist Chris Russo 
calls a “cheap little hatchback” rather than a 
“Ferrari”—could rival high-end machines in
capabilities for one-tenth of the cost. 

Russo believes a manufacturer could build 
and sell the design for $500,000. That’s 
within reach of a new hire’s startup pack- 
age, or equipment grants offered by research 
agencies, says Bridget Carragher, founding 
technical director of the Chan Zuckerberg 
Imaging Institute. “It would be a marvelous 
machine,” she says. “Everyone who wants to 
do structural biology could do it.”

Recent advances in artificial intelligence 
(AI) might seem to offer an even cheaper way 
to do structural biology. AI algorithms can 
accurately predict a protein’s structure from 
its amino acid sequence. But because AIs are
trained on known structures, their predic- 
tions sometimes falter with unusual protein 
configurations, Russo says, and they are still 
not substitutes for cryo-EM. 

To boost access to high-end cryo-EM mi- 
croscopes, the National Institutes of Health 
has created three national centers where 
have-not researchers can send samples. But 
the hub-and-spokes system comes with prob- 
lems. Rajan often spends months waiting for 
results from the national centers, only to find 
out her samples were duds. Although she is 
getting better at freezing proteins, Rajan
reckons that less than 10% of her samples
have resulted in good data.

U.K. researchers say their prototype cryo-electron
microscope could be built and sold for $500,000.

That’s why, even if researchers cannot af- 
ford a top cryo-EM microscope, many want a 
screening machine that could at least check 
the quality of the samples before sending 
them off to national centers for higher resolu- 
tion images. That was a primary motivation 
for Russo and his colleagues, who include 
Richard Henderson, a Nobel laureate at LMB 
who pioneered cryo-EM. One of the team’s 
key insights was that the electron beam 
does not need the energies typically used 
in high-end cryo-EM microscopes. Levels 
of 100 kiloelectronvolts (keV)—one-third as
high—suffice to reveal molecular structure, 
and they reduce costs by eliminating the 
need for a regulated gas, sulfur hexafluoride, 
to snuff out sparks. The team also saw room 
for improvement in the system of lenses that 
focuses the electrons and the detector that 
captures them after they probe the sample. 

With the resulting prototype, the LMB 
group determined structures of 11 diverse 
proteins. One was the iron-storing protein 
apoferritin, which is used as a cryo-EM 
benchmark. The LMB researchers mapped it 
at 2.6 angstroms—2.6 times the diameter of 
a hydrogen atom. That’s not as high as the 
record cryo-EM resolution of 1.2 angstroms, 
but plenty good enough to make an atomic 
model, Russo says. And the process was fast. 
Because the microscope sat in the same lab 
as the freezing stage, the team could quickly 
check that its samples were good enough, 
rather than waiting weeks for results from 
a high-end machine. “Every single structure 
was done in less than a day,” Russo says.

Thermo Fisher Scientific, which makes a 
top-end machine, says it is already expanding 
the cryo-EM market. In 2020, it began to sell 
a lower cost option, called Tundra, that operates
at 100 keV. “There are universities that
probably never believed they could own cryo- 
EM that now have the tools,” says Trisha Rice, 
a vice president who heads the company’s 
cryo-EM business. Indeed, Rajan’s university 
just ordered one for $1.5 million. 

Russo says Tundra is a step in the right 
direction, but his team’s innovations could 
make cryo-EM even cheaper. For example, 
he says, Tundra dials back the energy on 
a simplified version of the costly electron 
source used in top-end microscopes, whereas 
the electron gun on the LMB prototype was 
designed for 100 keV from scratch. But he
understands that commercializing his team’s 
design would require large investments by 
potential manufacturers. “We’re talking to all
of them,” Russo says. “But at the end of the
day, it’s up to them.” 




IMAGE: MATERIALS PROJECT/BERKELEY LAB 


ARTIFICIAL INTELLIGENCE 


DeepMind predicts millions of new materials 


AI-powered discovery could lead to revolutions in electronics, batteries, and solar cells


By Robert F. Service 


The materials cookbook has suddenly
grown tens of times longer. Modern 
technologies, from electronics to air- 
planes, draw on just 20,000 inorganic 
materials, largely discovered through 
trial and error; scientists have pre- 
dicted but not made tens of thousands more. 
This week, however, researchers report that 
with a new artificial intelligence (AI), they
have predicted the ingredients and prop- 
erties of another 2.2 million materials. In 
a companion study, a separate team has 
shown that such predicted materials can be 
made efficiently, again with the help of AI. 

Together, researchers say, the re- 
ports foreshadow a new age of ma- 
terials science, when AI programs 
and robots will power the search 
for the makings of novel batter- 
ies, superconductors, and catalysts. 
“It’s very impressive,” says Andrew
Rosen, a computational materials 
scientist at Princeton University. 

The predictions, published in 
Nature, are another coup for the 
AI innovators at DeepMind, an off- 
shoot of Google. Last month, they 
described an AI algorithm that 
runs on laptops and can predict 
the weather as accurately as large, 
supercomputer-driven models 
(Science, 17 November, p. 748). 
Prior to that, DeepMind developed
AlphaFold, an AI that’s able to pre- 
dict the 3D shape of hundreds of 
millions of different proteins from 
their amino acid sequence alone (Science, 
30 July 2021, p. 478). The new work, Rosen 
says, “is the AlphaFold equivalent for mate- 
rials science.” 

Like previous DeepMind achievements, 
this one trained an AI with extensive data. 
The researchers started with the Materials 
Project, a database of all known and pre- 
dicted inorganic crystals. That database 
includes not only each material’s crystal 
structure, but also properties such as its 
electronic structure, magnetic behavior, and 
hardness. Over the past decade, Materials 
Project teams have fed data on the 20,000 
known inorganic crystals into pattern- 
matching machine learning algorithms to 
predict another 28,000 inorganic crystals 
that should be stable. 
For their current work, DeepMind researchers,
led by Dogus Cubuk, who heads
materials discovery for the company, used 
the data on those 48,000 known and pre- 
dicted compounds—as well as information 
from other related databases—to train an 
“active learning” AI model. Dubbed GNoME 
(for Graph Networks for Materials Explora- 
tion), the AI can spot patterns beyond those 
in the original training data. It made an 
initial round of predictions of possible new 
stable crystals and calculated their proper- 
ties; the team then added the results to the 
training data and repeated the cycle. 
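
The predict, calculate, and retrain cycle described above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not DeepMind's method: the one-dimensional "candidates," the `oracle_energy` function standing in for an expensive physics calculation, and the nearest-neighbor `predict` surrogate are all hypothetical, whereas GNoME applies graph networks to real crystal structures.

```python
import random


def oracle_energy(x: float) -> float:
    """Hypothetical stand-in for an expensive stability calculation."""
    return (x - 0.3) ** 2 - 0.05


def predict(train: list[tuple[float, float]], x: float) -> float:
    """Toy surrogate model: reuse the energy of the nearest labeled example."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def active_learning(rounds: int = 5, batch: int = 10, seed: int = 0):
    rng = random.Random(seed)
    # Start from a handful of already-labeled examples.
    train = [(x, oracle_energy(x)) for x in (0.0, 0.5, 1.0)]
    for _ in range(rounds):
        # Propose many candidates, then keep those the surrogate
        # ranks as most stable (lowest predicted energy).
        candidates = [rng.random() for _ in range(200)]
        picked = sorted(candidates, key=lambda x: predict(train, x))[:batch]
        # Label the picks with the expensive calculation and
        # fold the results back into the training data.
        train.extend((x, oracle_energy(x)) for x in picked)
    # Negative energy plays the role of "predicted stable" in this toy.
    stable = [x for x, energy in train if energy < 0.0]
    return train, stable


if __name__ == "__main__":
    train, stable = active_learning()
    print(len(train), "labeled examples;", len(stable), "flagged as stable")
```

Each round, the surrogate nominates the candidates it expects to be most stable, the stand-in calculation labels them, and the enlarged training set informs the next round of guesses.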

Barium (blue), niobium (white), and oxygen (green) form a novel material.

After several such rounds, GNoME wound
up with predictions for the 2.2 million new
compounds. The calculated “formation
energy”—a measure of stability—for 381,000
of them suggested that if researchers could 
synthesize them, they should be stable and 
not decompose into other structures. 
Among the finds are layered materials like 
those used in battery electrodes. Whereas 
the Materials Project identified 1000 such 
compounds, GNoME predicted 52,000, in- 
cluding 528 lithium-ion conductors, a kind 
of material critical to today’s best batteries. 
Cubuk also notes that in contrast to previ- 
ous predicted crystals, which mostly com- 
bined two, three, or four elements, many of 
DeepMind’s predicted structures contain 
five and even six elements. “This is really 
exciting,” says Alexander Ganose, a materials
chemist at Imperial College London.
“It is enabling materials discovery across a
much wider composition range. We might 
be able to find the materials of the future in 
this data set.” 

The next step is actually synthesizing the 
materials, traditionally a process of trial 
and error that can take months or years 
for a single compound. “Predicting something
is nice,” says Janine George, a computational
materials scientist at the Federal
Institute for Materials Research and Testing 
in Berlin. “But making it is really great.” 

External benchmarks suggest GNoME’s 
success rate at predicting stable structures 
reaches 80%, up from the 50% achieved by 
previous algorithms. And the DeepMind 
researchers note that indepen- 
dent experimenters have already 
made 736 of the predicted materi- 
als, verifying their stability. Cubuk 
says even materials not certain to 
be stable might be extremely long 
lasting, just as diamond survives 
1 billion years or more before de- 
composing into graphite. 

A different kind of AI might help 
synthesize more of GNoME’s pre- 
dictions, another paper this week 
in Nature suggests. Researchers 
at Lawrence Berkeley National 
Laboratory, led by materials scien- 
tist Gerbrand Ceder, recently built 
an AI-driven robotics lab to make
predicted new materials (Science, 
21 April, p. 230). Now, he and his 
colleagues report that this setup 
quickly learned to refine recipes for 
synthesizing new compounds predicted
by the Materials Project algorithm. In
17 days, the robots successfully synthesized
41 materials out of 58 they attempted.

DeepMind researchers say they will im- 
mediately release data on the 381,000 com- 
pounds predicted to be stable and make the 
code for its AI publicly available. They may 
ultimately release all 2.2 million recipes. 
But Ganose, for one, does not want to wait. 
Studying the whole panoply could help sci- 
entists better determine what allows some 
compounds to be stable whereas others are
less so. “If this is locked away that’s a real
loss to science,” Ganose says. Cubuk, however,
notes that with almost 10 times more
targets to aim for than ever before, materi- 
als scientists already have plenty to keep 
their test kitchens busy. 
AN ALKALINE SOLUTION


As alarm about climate change grows, scientists explore a strategy
for drawing excess carbon dioxide into the ocean

By Warren Cornwall

Standing on the aft deck of a modified
13-meter fishing boat in Halifax
Harbour, Dariia Atamanchuk gazes
at both a cause of the climate crisis
and, she hopes, part of the solution.

On the nearby shore, three red-and-white-striped
smokestacks rise like enormous barber poles,
funneling carbon dioxide (CO₂)
from a natural gas-fueled power plant
into the pale morning sky. At the seawall
in front of the plant, seawater used to cool
its piping flows into the harbor. Normally
that water runs clear. But today, the outflow
roils in a pink froth, like a cauldron of
Pepto Bismol. “Ooh, that’s very milky,” says
Atamanchuk, a chemical oceanographer at
Dalhousie University.

The colorful eruption, part of an experi- 
ment led by the Halifax, Canada-based 
company Planetary Technologies, contains 
red rhodamine dye and a slurry of white 
magnesium hydroxide, the main ingredient
in the drug store antacid Milk of
Magnesia. The alkaline mineral should,
in theory, raise the pH in the surrounding 
seawater, triggering a chemical reaction 
that will absorb CO₂ from the atmosphere
and convert it to bicarbonate, an ion that 
can float through the ocean undisturbed 
for millennia. 

A pink alkaline slurry was released into Halifax
Harbour during an experiment in October.

In the past few years, this relatively simple
reaction has attracted a swarm of philanthropists,
government officials, and climate
entrepreneurs. Growing alarm about the
pace of climate warming and the prospect
of a market for carbon-removal credits
that could reach $1 trillion by the middle
of the next decade are driving much of 
the activity. Researchers like Atamanchuk, 
who once had little interest in tinkering 
with the climate, are joining the throng, 
drawn by a flood of research money and a 
desire to understand the promise and per- 
ils of pouring alkaline substances into the 
world’s oceans in quantities that could rise 
to billions of tons. 

“It’s like being in the Klondike now,” says
Dalhousie ecologist Hugh MacIntyre, refer- 
ring to the gold rush that brought prospec- 
tors to remote northwestern Canada in the 
1890s. “It seems like there’s a new initiative 
every 6 months.” 

Proponents argue that the approach, if 
implemented on a grand scale, could ab- 
sorb as much greenhouse gas from the 
atmosphere as the Amazon rainforest, help- 
ing keep global warming to under 2°C. They 
also contend that it involves fewer risks 
than other possible methods for pulling CO₂
into the oceans, such as fertilizing waters to 
spark giant algae blooms. 

But questions remain about how much 
carbon could really be captured and what en- 
vironmental side effects may be in store if hu- 
mans start to tinker with the ocean on such 
a huge scale. The Halifax experiment, one of 
the first of its kind, offers a glimpse of that 
potential future and the flood of unanswered 
scientific questions swirling around it. 


THE OCEAN ALREADY contains roughly 
40 trillion tons of carbon, an amount that 
dwarfs the roughly 1 trillion tons in the air 
and 3.5 trillion tons in land ecosystems. 
Some comes from marine plants and ani- 
mals whose dead bodies and feces sink to 
the ocean floor. But the vast majority—more 
than 90%—is in the form of two inorganic 
ions, carbonate and bicarbonate. 

That trove of stored carbon has been in- 
creasing, saving us from the worst effects of 
climate change. As humans pump CO₂ into
the air, some of it is absorbed at the ocean’s 
surface, where it goes through a cascade of 
reactions that shuttle the carbon from one 
molecule to another, eventually lodging 
it chiefly in bicarbonate. The process has 
soaked up about 30% of all carbon emis- 
sions since the mid-1800s. 

The reactions depend on the presence 
of alkaline minerals found in rocks such as 
limestone, which are dissolved by rainwater 
and washed into the sea. Once dissolved, 
these minerals release hydroxide ions that 
can join with CO₂ to make bicarbonate.
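
Written as a simplified net reaction, this step is textbook acid-base chemistry. The second equation, for magnesium hydroxide, is an idealized upper bound added here for illustration; in real seawater, further carbonate equilibria reduce the yield below two CO₂ molecules per unit of mineral.

```latex
\[
\mathrm{OH^- + CO_2 \longrightarrow HCO_3^-}
\qquad\qquad
\mathrm{Mg(OH)_2 + 2\,CO_2 \longrightarrow Mg^{2+} + 2\,HCO_3^-}
\]
```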

This natural process will eventually consume
most of the excess CO₂ we are pumping
into the atmosphere. But that will take
thousands of years. So some scientists and 
entrepreneurs want to speed up the process 
with additional hydroxide ions—the alkaline 
equivalent of a shot of adrenaline. 

Mike Kelland, CEO of Planetary Tech- 
nologies, is one. Standing near the power 
plant’s smokestacks, he pours the fuel for 
his company’s Halifax experiment into his 
hand. The flourlike white powder, a mineral 
form of magnesium hydroxide called bru- 
cite, is combined with seawater in van-size 
plastic tanks, creating a mixture resembling 
a vanilla milkshake. A dash of nontoxic dye 
is added so observers can easily track how it 
spreads in the ocean, and the pink cocktail is 
gradually poured into a stream of seawater 
leaving the power plant. 

Over nearly 2 months, Kelland’s company
plans to send 280 tons of brucite into
the harbor, a prelude to bigger releases in
coming years. He estimates the October
event will coax the seawater to consume
more than 200 tons of extra CO₂ from the
atmosphere. After taking into account the
emissions produced from mining and shipping
the brucite from China, the final tally,
he says, should come to at least 100 tons
less CO₂ warming the planet. “It is very
much trial scale,” says Kelland, a software
entrepreneur and electrical engineer who
founded the company in 2019. But if the
process were expanded to larger facilities
and around-the-clock operations, “you sort
of look at that and go, ‘Wow, this is a potential
game changer.’”

The alkaline mineral brucite can raise the pH of ocean water and trigger carbon dioxide-capturing reactions.

KELLAND IS FAR from alone in seeing a business
opportunity in ocean alkalinity enhancement.
In the past 4 years, at least four
other North American companies have begun
to pursue similar projects. One, Project
Vesta, in 2022 spread an alkaline mineral
called olivine on a Long Island beach to test
whether that would result in more alkaline
waters as waves wash the material out to
sea. Another, Calcarea, is developing a way
to funnel CO₂-rich exhaust from oceangoing
ships through an alkaline solution before
pumping it overboard. Several companies
want to build seawater treatment factories
that will use electricity to boost alkalinity
and capture waterborne CO₂. One, Los
Angeles-based Equatic, has small test plants
running in California and in Singapore.
Equatic is preparing to build a larger
plant in Singapore capable of capturing
5000 tons of CO₂ per year. “It’s at the level
and scale of what you would do before you
launch full-scale commercial products,” says
Gaurav Sant, an engineer at the University
of California, Los Angeles, who co-founded
the company.

If these companies are successful in scal- 
ing up their operations, the industry that 
emerges will likely be massive. For any 
strategy to take a worthwhile bite out of the 
problem, carbon capture proponents often 
point to a benchmark of capturing 1 billion 
tons of CO₂ per year—approximately 2.5%
of current annual emissions. Reaching that 
goal would require mining approximately 
1 billion tons of alkaline rock per year or, 
in the case of Equatic’s approach, devoting 
nearly 8% of world electricity production to 
treating seawater. 
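
The scale implied by these figures is easy to check with a few lines of arithmetic. The sketch below uses only numbers quoted in the article (the 1-billion-ton benchmark, its stated 2.5% share, and the 5000-ton-per-year capacity of Equatic's planned plant); the function and constant names are illustrative, and the plant count is a back-of-envelope comparison, not a projection anyone in the article makes.

```python
BENCHMARK_TONS = 1e9           # 1 billion tons of CO2 per year
BENCHMARK_SHARE_PCT = 2.5      # stated share of current annual emissions
PLANT_TONS_PER_YEAR = 5000.0   # planned Equatic plant in Singapore


def implied_global_emissions(benchmark_tons: float, share_pct: float) -> float:
    """Annual emissions implied by 'benchmark is share_pct of the total'."""
    return benchmark_tons * 100.0 / share_pct


def plants_needed(benchmark_tons: float, plant_tons: float) -> float:
    """Number of plants of a given capacity needed to hit the benchmark."""
    return benchmark_tons / plant_tons


if __name__ == "__main__":
    total = implied_global_emissions(BENCHMARK_TONS, BENCHMARK_SHARE_PCT)
    count = plants_needed(BENCHMARK_TONS, PLANT_TONS_PER_YEAR)
    print(f"Implied annual emissions: {total / 1e9:.0f} billion tons")
    print(f"Plants of this size needed: {count:,.0f}")
```

In other words, the benchmark implies roughly 40 billion tons of annual emissions, and reaching it would take on the order of 200,000 plants of the planned Singapore size.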

Dariia Atamanchuk (left) lowers a device that will measure pH and other aspects of ocean chemistry into Halifax Harbour. Mathieu Dever (right), an oceanographer with the instrument company RBR, prepares to deploy a remote-controlled underwater glider that will track changes in ocean chemistry.

First, companies need to convince prospective
carbon credit buyers in industry
and government that they can measure
how much extra CO₂ is actually being
captured. “The biggest question is how 
do we confirm that the carbon dioxide re- 
moved is additional to what would happen 
naturally,” says Sarah Cooley, a chemical 
oceanographer and head of the climate sci- 
ence program at the environmental group 
Ocean Conservancy. 

That’s where Atamanchuk comes in. 
Spread in front of her on the deck of the 
fishing boat in Halifax Harbour is an ar- 
ray of sensor-laden tools. As part of an 
$11 million research initiative funded by 
a consortium of philanthropies called the 
Carbon to Sea Initiative, she aims to pin- 
point the added brucite and track how the 
water chemistry evolves as it spreads. 

She starts by easing a 1-meter-tall
metal frame into the water, outfitted with 
equipment to track the water’s cloudi- 
ness, salinity, temperature, and pH near 
the power plant outfall. Nearby, two col- 
leagues watch as a gray, torpedolike cyl- 
inder descends beneath the waves. It will 
retrieve a water sample from 16 meters’ 
depth, to be screened for alkalinity, pH, 
salinity, carbon, and rhodamine. Another 
remote-controlled neon pink device mo- 
tors away from the boat on a programmed 
route through the harbor, sucking up still 
more measurements. 

“The biggest challenge I see is detecting 
a signal,” Atamanchuk says. In coastal Nova 
Scotia, natural pH levels vary wildly as tidal
currents sweep in and out, changing by as 
much as 70% in 6 hours. By comparison, the 
added brucite might raise the water’s pH by 
0.1—roughly 25%. 
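
Because pH is a base-10 logarithmic scale (pH is the negative log of the hydrogen ion concentration), small pH shifts correspond to sizable percentage changes. The sketch below converts between the two. The function names are hypothetical, and it assumes the article's percentages refer to hydrogen ion concentration, which the text does not state explicitly.

```python
import math


def hydrogen_ion_change(delta_ph: float) -> float:
    """Fractional change in hydrogen ion concentration for a pH shift.

    A positive delta_ph (more alkaline) gives a negative result,
    meaning fewer hydrogen ions.
    """
    return 10.0 ** (-delta_ph) - 1.0


def ph_shift_for_change(fraction: float) -> float:
    """pH shift that produces a given fractional change in hydrogen ions."""
    return -math.log10(1.0 + fraction)


if __name__ == "__main__":
    print(f"pH +0.1 changes hydrogen ions by {hydrogen_ion_change(0.1):+.1%}")
    print(f"A +70% change in hydrogen ions shifts pH by {ph_shift_for_change(0.70):+.2f}")
```

Raising pH by 0.1 lowers the hydrogen ion concentration by about 21%; put the other way, the starting concentration is about 26% higher than the final one, which is consistent with the "roughly 25%" quoted above.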

There is also no guarantee that all the 
brucite poured into the harbor will translate
into CO₂ pulled from the atmosphere.
Currents could suck the more alkaline wa- 
ter into the depths, below the critical sur- 
face zone, where gases are exchanged with 
the air. Or some of the mineral might fail to 
dissolve and sink to the seabed instead. 

If it settles out, the brucite could even 
interrupt chemical reactions that dissolve 
naturally occurring alkaline materials in the 
sediment, which would cancel out some of 
the climate benefit. “Is that not a big deal? Is 
it a big deal? We don’t know,” says Dalhousie 
biogeochemist Chris Algar, who is studying 
that question in Halifax Harbour. 

The data from Atamanchuk’s fleet of sen- 
sors will be fed into a detailed model of the 
harbor created by Katja Fennel, a Dalhousie 
ocean modeling expert. Fennel has spent 
much of her career predicting low-oxygen 
zones in places like the Gulf of Mexico. Now, 
she wants to predict how the added brucite 
will alter the harbor’s water chemistry. Her 
model is powered by basic ocean physics 
and decades of detailed observations of 
water chemistry and currents, as well as 
short-term forecasts of weather and tides. If 
she’s able to fine-tune the model to perform
well, the approach could be applied far beyond
Halifax, informing decisions about
what sensors should float near an alkalinity 
source to track chemical changes and guid- 
ing estimates of how much carbon has been 
sucked into the ocean. 

So far, Fennel is pleased with the results. 
From her desk on Dalhousie’s campus, she 
swivels to her computer screen and starts a 
short video. A purple blob expands across a 
map of the harbor, tracing the spread of rho- 
damine from a previous test. Some of the re- 
searchers had expected changing tides would 
cause the dye to pulse back and forth, but the 
model forecasted that the winds would drive 
it to the southeast. The model turned out to 
be right. “We’ve gained some confidence that 
the model has skill,” she says.


ANY MISSTEP could derail plans, as the pro- 
ponents of ocean alkalinity enhancement 
are keenly aware. Some point to the furor 
that erupted 11 years ago when a fishing 
boat floating near the Haida Gwaii archi- 
pelago, off the west coast of Canada, dumped 
120 tons of iron sulphate and iron oxide 
into the water. The iron fertilized plankton 
growth, igniting a bloom visible from space. 
Proponents said the approach would boost 
salmon runs and capture CO₂.

The private venture sparked a backlash 
from environmentalists who feared un- 
intended ecological harm. The incident 
effectively shut down a whole branch of
research on what happens when iron, a
nutrient whose scarcity limits plankton
growth in much of the ocean, is added to
surface waters. Up to then, 13 large-scale ex-
periments had taken place in various loca-
tions. None has happened since.


Ocean engineering

In an effort to combat climate change, scientists are exploring ways to coax the ocean to absorb
more carbon dioxide (CO₂) from the atmosphere by altering its water chemistry, a strategy known
as ocean alkalinity enhancement.


Delivering alkalinity

A number of delivery methods are currently under
consideration, some ship-based and others coming
from the shore. (The pink color is included
for visualization purposes; dye is only used
in some experiments.)

1 Ship spreading
Alkaline materials such as limestone
or olivine could be ground up or
dissolved in water and then poured
into the ocean from a ship.

2 Alkalinity factories
Sewage treatment or power plants
could inject alkaline solutions
into pipes leading to the ocean.
Electrochemical treatments could
also raise the pH of seawater.

3 Beach erosion
Ground-up alkaline rocks could
be spread on beaches, where
they would gradually wash into
the ocean and dissolve.


Enhancing the carbon cycle

Boosting the alkalinity of ocean water can accelerate a natural process (shown below in order) that sucks airborne CO₂
into the ocean and converts it into bicarbonate.

1 Maintaining equilibrium
CO₂ is exchanged between the atmosphere and surface
waters. Over time, an equilibrium is reached.

2 Adding alkalinity
When scientists dissolve alkaline substances such as
magnesium hydroxide (Mg(OH)₂) in the ocean, hydroxide
ions (OH⁻) are produced.

3 Carbon capture
Hydroxide ions react with aqueous CO₂, creating
bicarbonate (HCO₃⁻) ions, which can float in the ocean
for millennia.

4 Room for more
With fewer CO₂ molecules in the water, some CO₂
moves out of the atmosphere and into the ocean to
restore equilibrium.
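
The steps listed under "Enhancing the carbon cycle" can be written as two simplified reactions. This is only a sketch: it assumes magnesium hydroxide as the alkalinity source (the example named in the sidebar) and ignores the further carbonate equilibria that operate in real seawater.

```latex
\begin{align}
\mathrm{Mg(OH)_2} &\longrightarrow \mathrm{Mg^{2+}} + 2\,\mathrm{OH^-}
  && \text{(step 2: adding alkalinity)} \\
\mathrm{OH^-} + \mathrm{CO_2(aq)} &\longrightarrow \mathrm{HCO_3^-}
  && \text{(step 3: carbon capture)}
\end{align}
```

Read this way, each dissolved unit of Mg(OH)₂ could convert at most two CO₂ molecules to bicarbonate; because some of the added alkalinity ends up in other carbonate species, the yield in practice is expected to be somewhat lower than this upper bound.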

“I don’t want to see this carbon dioxide
removal industry get so far ahead of the
science that another [controversial] event
happens,” says Adam Subhas, a chemical
oceanographer at the Woods Hole Oceano- 
graphic Institution. He is leading a project 
to study the effects of pouring tons of alka- 
line sodium hydroxide into the ocean off the 
New England coast. 

Unlike plankton blooms or massive sea- 
weed farms—other proposed approaches for 
capturing carbon in the ocean—alkalinity 
enhancement doesn’t depend on a cascade 
of hard-to-measure biological processes, 
Subhas says. It is “just simple acid-base
chemistry.” It could even benefit ecosystems.
Alkalinity enhancements have already been 
used to help oyster farmers protect their 
nurseries from acidic seawater and to repair 
rivers and lakes damaged by acid rain. 

But there are a lot of unknowns about 
how living things would react. That’s why 
Dalhousie biogeochemist Tatjana Živković
has been studying microbes pulled from 
Halifax Harbour. For the past year, the 
postdoctoral researcher has poured water 
samples into small plastic bottles, some 
with an added dose of magnesium hydrox- 
ide. After waiting 48 hours—several life 
spans for the microscopic organisms in the 
water—Živković scans the water with la-
sers and screens the DNA to figure out how
much the plankton has grown and which 
species are there. 

So far, nothing has raised alarms. “When 
you look at the phytoplankton, we don’t see 
much community change,” she says. 

It’s a similar message from MacIntyre, 
the Dalhousie ecologist. His team has been 
rearing phytoplankton in the lab, exposing 
them to higher levels of alkalinity, then zap-
ping them with light to see whether their
photosynthetic powers are affected. “We
have been consistently turning up little to 
no difference,” he says. 

But short-term laboratory experiments 
aren’t the same thing as sustained expo- 
sures in the wild. Some of the most ambi- 
tious research is being led by Ulf Riebesell, 
a marine biologist at GEOMAR Helmholtz 
Centre for Ocean Research Kiel. At three 
locations in the Atlantic Ocean and North 
Sea, his team has lowered rows of 20-meter- 
long plastic cylinders into the water, creat- 
ing giant oceanborne test tubes to which 
alkalinity can be added. 

Planetary Technologies plans to conduct more ocean alkalinity enhancement experiments in Halifax Harbour.

Although the results are unpublished,
Riebesell says they suggest the method
of alkalinity enhancement could matter
a lot. His team observed little change in 
the plankton community when it added 
seawater that was “pretreated” with al- 
kalinity and allowed to equilibrate with 
atmospheric CO₂, a method some scien-
tists see as a potentially gentler way to
add alkalinity because the CO₂-capturing
reactions take place before the alkaline 
solution reaches the ocean. But two other 
tests involving the direct addition of alka- 
line substances found effects. A mixture 
simulating olivine, which released large 
amounts of silicate, spurred the growth 
of diatoms, which build their skeletons 
from silica; and a test using calcium hy- 
droxide changed the timing of phyto- 
plankton blooms and lowered the overall 
zooplankton biomass at the highest doses 
of alkalinity. 

Research by Riebesell’s lab also suggests 
metals such as nickel, which can be pres- 
ent in alkaline rocks, could have harmful 
effects. In a recent experiment, two species 
of phytoplankton were insensitive to the 
metal, but one species grew more slowly 
even when exposed to relatively small 
amounts. The results raise the possibility 
that nickel-infused minerals could alter the 
balance of plankton in an ecosystem. 

Still, Riebesell is bullish about this ap- 
proach to capturing carbon, assuming big- 
ger, longer lasting experiments support its 
promise. “I suspect that at the end of the
day we will conclude that if done with cau-
tion and avoiding ecological thresholds, the
benefits of [ocean alkalinity enhancement] 
will outweigh potential risks.” 


RESEARCH on ocean alkalinity enhancement 
is in its infancy, but there are signs it will 
grow up fast. “If you had predicted 3 years 
ago I would be almost exclusively focused on 
this now, I would have laughed,” Fennel says.

The burst of scientific activity has been 
fueled by a growing stream of research 
dollars from companies, universities, phi- 
lanthropists, and governments. The fund- 
ing for Fennel and Atamanchuk’s work 
is part of a $50 million investment from 
the Carbon to Sea nonprofit. Dalhousie 
recently unveiled a CA$400 million ini- 
tiative to study the ocean’s role in climate 
change, including a focus on carbon cap- 
ture methods. In the United States, the 
National Oceanic and Atmospheric Ad- 
ministration in September announced 
$24 million in research grants for ocean car- 
bon removal, with 65% of the money going 
to alkalinity-related work. And in Europe, 
the German government recently commit- 
ted €11.6 million to alkalinity research. 

The surging interest has policymak- 
ers hurrying to keep pace with an indus- 
try for which few regulations have been 
developed. In October, the White House’s 
Office of Science and Technology Policy an-
nounced it was forming a “fast-track” com-
mittee to craft a cohesive policy for marine
CO₂ removal. “If this is going to develop
into a new industry, we want to do it in a
way that is based in good science from the 
very beginning,” says oceanographer Scott 
Doney, the office’s assistant director for 
ocean climate science and policy. 

Two years ago, Doney, then at the Uni- 
versity of Virginia, chaired a National 
Academies of Sciences, Engineering, and 
Medicine panel that issued a report call- 
ing for $1.3 billion in funding to study 
ocean carbon removal. Today he sees the 
beginnings of that investment, and the po- 
tentially thorny challenges it raises. They 
include the need for regulations governing 
experimental and industrial-scale alkalin-
ity additions, protocols for monitoring any
environmental damage, and research into
the potential repercussions of mining huge 
quantities of alkaline minerals or lavishing 
electricity on increasing the pH of seawater. 

Many industry players are keen to scale 
up and get approval for large projects. 
But they are reluctant to get too far out in 
front, the Ocean Conservancy’s Cooley says. 
Early movers will be the first to navigate 
untested regulatory waters, confront tech- 
nical conundrums, and woo a public wary 
of schemes to rewire the planet to tackle 
climate change. “Everybody’s rushing to be 
second,” she says. “Nobody wants to be first. 
Because they just want whoever’s first to 
break the path.”

In the wild, chinstrap penguins
(Pygoscelis antarcticus) sleep
for an average of 4 s at a time.


Penguins snatch seconds-long microsleeps 


Chinstrap penguins fall asleep thousands of times per day in the wild 


By Christian D. Harding and
Vladyslav V. Vyazovskiy


Considerable research efforts have been
committed to understanding the fun- 
damental biology of sleep. Yet, this 
knowledge is mostly derived from lab- 
oratory studies undertaken in a hand- 
ful of model organisms, such as mice, 
rats, and fruit flies, and in conditions that 
are vastly different from those where sleep 
evolved. Studies of nonmodel organisms in 
natura may help elucidate functions of sleep,
but this is still largely uncharted territory
and presents numerous challenges—from 
unconventional anatomy and physiology to 
distinct ecological specialization and envi- 
ronmental influences. On page 1026 of this 
issue, Libourel et al. (1) report an unusual
pattern of frequent short bouts of sleep in 
wild chinstrap penguins (Pygoscelis ant- 
arcticus), which calls into question not only 
the current understanding of how sleep ar- 
chitecture is regulated but also the extent to 
which it can be altered before the benefits of 
sleep are lost. 


Sleep seems to be ubiquitous in the ani- 
mal kingdom and is defined by features such 
as lack of movement and relative loss of abil- 
ity to sense and respond to the environment 
(2). Although the environment thus plays a 
critical role in the very definition of sleep, it 
has traditionally been considered relevant 
only because it can be standardized to en- 
able studies into endogenous mechanisms or 
“drives” for sleep—which are thought to re- 
flect primary biological need—unspoiled by 
noisy environments. The limitations of this 
approach are made clear when species are
studied in naturalistic conditions. Predation
risk, lighting conditions, seasons, reproduc- 
tion, food availability, temperature, and so- 
cial environment all influence the diurnal 
dynamics and architecture of sleep both 
across species and across time within an in- 
dividual (3-10). 

Libourel et al. implanted electrodes for 
recording electrical activity in the brain 
[electroencephalogram (EEG)] and neck 
muscles (electromyogram) as well as an ac- 
celerometer to capture position informa- 
tion in freely roaming and nesting chinstrap 
penguins in the Antarctic. They also used 
noninvasive sensors to monitor location and 
ambient temperature. Combined with con- 
tinuous video monitoring and direct obser- 
vations over multiple days, the authors were 
able to identify periods of sleep in penguins 
that showed a number of notable peculiari- 
ties. Bouts of sleep were defined primarily as 
transient increases in EEG slow-wave activ- 
ity, occurring both bilaterally and unilater- 
ally. The latter was weakly correlated with 
contralateral eye closure, as has previously 
been described in birds (8). Penguins nesting 
at the periphery of the colony slept in a more 
consolidated manner and engaged in what 
the authors referred to as a deeper sleep. 
Notably, contrary to the conventional under- 
standing of sleep, penguins did not engage in 
prolonged, consolidated periods of sleeping. 
Instead, the birds were observed to nod off 
frequently, accumulating more than 11 hours 
of sleep per day in thousands of brief epochs 
that lasted only 4 s on average and are there-
fore called “microsleeps.”
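
A quick back-of-the-envelope check shows how these figures fit together. The sketch below takes the two numbers given above (about 11 hours of sleep per day, in bouts averaging 4 s) and estimates the implied number of bouts per day. The function name is hypothetical, and treating 11 hours as an exact value is a simplifying assumption, since the text says "more than 11 hours."

```python
def bouts_per_day(total_sleep_hours: float, mean_bout_seconds: float) -> float:
    """Estimate the number of sleep bouts per day.

    Divides total daily sleep time (converted to seconds) by the mean
    bout length. Both inputs are illustrative values from the text.
    """
    if mean_bout_seconds <= 0:
        raise ValueError("mean_bout_seconds must be positive")
    return total_sleep_hours * 3600 / mean_bout_seconds


# Assumed inputs: 11 hours of daily sleep, 4-second average bouts.
estimate = bouts_per_day(total_sleep_hours=11, mean_bout_seconds=4)
print(f"about {estimate:,.0f} microsleeps per day")
```

With these inputs the estimate is roughly 9,900 bouts per day, which is consistent with the "thousands" of brief epochs described above.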

The data reported by Libourel et al. could 
be one of the most extreme examples of the 
incremental nature by which the benefits of 
sleep can accrue. Although sleep bout dura- 
tion is sensitive to many variables and differs 
widely among species, the seconds-long mi- 
crosleeps of chinstrap penguins are markedly 
brief. Proving that sleeping in this way comes 
at no cost to the penguin would challenge the 
current interpretation of fragmentation as 
inherently detrimental to sleep quality. 

The findings also provide strong evidence 
for the adaptation of sleep architecture to ex- 
treme environments. Penguins sleep in large, 
noisy colonies, where they are constantly 
bombarded by stimulation from movement 
and sounds produced by coresidents. They 
are also under surveillance by brown skua 
(Stercorarius antarcticus)—predatory birds 
that feed on the eggs and chicks of penguins.
Sleep is, by definition, incompatible with the
vigilance required to protect the nests. If it 
can be determined that penguins are less 
responsive to environmental stimuli during 
microsleeps, as normally occurs during sleep, 
then the constant switching between wake 
and sleep may be a strategy that penguins 
have evolved to balance sleep and vigilance 
requirements. An alternative interpretation 
for this switching is that the birds are expe- 
riencing interrupted sleep—an unintended 
and perhaps unwanted consequence of 
their boisterous social surroundings. Sleep 
within a colony is therefore likely to occur 
in a piecemeal fashion, where, at any given 
moment, some birds are asleep and others 
awake so that the entire colony is always 
half-awake and half-asleep—similar to the 
idea of “corporate vigilance” in doves (11). 

That Libourel et al. observed longer bouts 
and deeper sleep in birds on the periphery 
of the colony argues against the hypothesis 
that locally increased predation risk results 
in less consolidated sleep. This interpreta- 
tion of the data is contingent on a number 
of factors. Although brown skuas are as- 
sumed to be the main threat to the pen- 
guins, some dangers, or simply disturbance, 
might arise from inside the colony—for ex- 
ample, noise or risk of theft of nest material 
by other penguins. It should be added that 
“sleep depth” is a widely used but somewhat 
questionable metaphor because objective 
and subjective measures of sleep depth can 
be dissociated, and the relationship of sleep 
depth with brain activity is not straightfor- 
ward (12). Another interpretation is that 
birds at the periphery do need to be extra 
vigilant, resulting in their awake state being 
of higher “intensity,” which in turn requires 
more-intense recovery sleep (13). 

With the advent of modern techniques 
that dissect the contribution of specific 
brain circuits to sleep oscillations and local 
and global control of sleep, elucidating the 
neurophysiological substrate of sleep has 
become a focus of sleep research. However, 
most of the progress made in this area has 
come from mammals, and knowledge of 
sleep-wake-controlling circuitry in birds 
remains in its infancy. Given the distinctive 
structure of the brain in birds, the basic or- 
ganization of arousal- and sleep-promoting 
networks is likely to be fundamentally dif- 
ferent. Therefore, the frequent alternation 
of wake and sleep observed in chinstrap 
penguins has important implications for the 
understanding of sleep control in general. 
For example, instability of sleep-wake states 
in mice has been linked with malfunction 
of the orexinergic system, such as in nar-
colepsy (14), and in humans, conditions
that fragment sleep, such as sleep apnea, 
may have major consequences for cogni-
tive function and possibly even precipitate
the development of neurodegenerative dis- 
ease (15). Thus, what is abnormal in humans 
could be perfectly normal in birds or other 
animals, at least under certain conditions. 

Any observations made regarding sleep 
will reflect not only hard-wired species- 
specific genetically determined features, 
but also the immediate role of the environ- 
ment and other ecological and physiological 
needs. Therefore, not considering context 
will bias interpretation. For example, mi- 
crosleeps in penguins may be an extension 
of previously observed seasonal changes 
in sleep duration among birds (8), present 
only when penguins are nesting and caring 
for their young. The wider and crucially im- 
portant implication of this role of context 
is that data on molecular or neurophysi-
ological substrates of sleep derived from a
typical laboratory experiment may not be 
reproducible if repeated in other, nonstan- 
dard conditions, such as at thermoneutral- 
ity, in a more naturalistic social setting, or 
simply when the animals can choose their 
environment. Studies like that of Libourel 
et al. reveal underappreciated flexibility in
sleep phenotypes, which can, in turn, in- 
form traditional laboratory studies. 

Climate change and human activities 
are applying increasing pressure on natu-
ral habitats, thereby altering ecosystems,
and light pollution and noise are affecting 
the amount and the quality of sleep in wild 
animals. Therefore, sleep studies in the wild 
are also crucially important in the context 
of conservation. Although there will always 
be ethical and ecological concerns associ- 
ated with research in wild animals, consci- 
entious and well-constructed studies, such 
as that of Libourel et al., are the best way to 
exploit opportunities to study sleep in wild
animals free from human influence while it 
is still possible.

Filming DNA repair at the atomic level


Dissection of multistep catalysis by a photoenzyme could inspire green chemistry applications 


By Marten H. Vos 


Enzymes catalyze biochemical reac-
tions at high rates and specificity. The
structural characterization of transition
states and intermediates of enzymatic
catalysis at the atomic level provides
fundamental understanding and could
reveal bioengineering possibilities. For most 
enzymes, real-time studies are limited by sub- 
strate-binding kinetics. By contrast, in natu- 
ral photoenzymes, enzyme-substrate com- 
plexes can be preformed and catalysis can 
be initiated by ultrashort light pulses. The 
photoenzyme cyclobutane pyrimidine dimer 
(CPD) DNA photolyase repairs the potentially 
mutagenic DNA CPDs that are induced by ul- 
traviolet light. On pages 1014 and 1015 of this 
issue, Maestre-Reyna et al. (1) and Christou 
et al. (2), respectively, report time-resolved 
crystallographic studies of photolyase, char- 
acterizing a wealth of functional steps span- 
ning picosecond to microsecond timescales, 
including electron transfer, substrate bond 
breaking, and product migration. Beyond 
establishing the order of these events, these 
findings may inspire the engineering of sun- 
light-driven biobased catalysis. 

Natural photoenzymes form stable com- 
plexes with substrate molecules in the dark 
(3). The catalytic transformation of substrates 
to products then occurs upon photon absorp- 
tion by a molecule within this complex. Very 
few natural photoenzymes are known, but by 
using sunlight directly to drive catalysis, they 
are intrinsically “sustainable” bioreactors. 
Moreover, the breadth of photoproducts can 
be expanded by engineering photoenzymes
for green chemistry photocatalysis or by con-
structing enzyme variants inspired by photo-
enzymes. On a more fundamental level, using
ultrashort light pulses allows visualization of 
photoenzyme catalytic intermediates in real 
time with femtosecond resolution, that is, 
down to the timescale of intrinsic nuclear 
motions. Dissecting such natural catalytic 
pathways is also informative about nonpho- 
tochemical enzymes. 

Three different photoenzymes are known 
to date. Catalysis by light-dependent proto- 
chlorophyllide oxidoreductase (involved in 
chlorophyll biosynthesis) is initiated by the 
excitation of the substrate itself. The two 
other photoenzymes, photolyase and the algal 
enzyme fatty acid photodecarboxylase (FAP), 
both harbor a blue light-absorbing flavin 
adenine dinucleotide (FAD) coenzyme, but 
their photochemical mechanisms are mark- 
edly different. FAD can adopt three different 
redox states (4), and both enzymes produce 
semireduced FAD radical intermediates. But, 
FAP-mediated photoreduction of the oxidized 
flavin leads to oxidation of the nearby sub- 
strate followed by bond breaking, whereas 
photolyase-mediated photooxidation of the 
fully reduced FAD (FADH⁻) leads to CPD re-
duction followed by (double) bond breaking
(5). Of the three enzymes, only photolyase is 
also found in nonphotosynthetic organisms, 
although not in placental mammals. 

The mechanisms of the three photoen- 
zymes have been studied by time-resolved 
optical spectroscopy. However, structural 
snapshots of intermediates with atomic 
resolution have hitherto only been obtained 
in FAP (6). Upon oxidation of the fatty acid
substrate by the excited flavin, the bond be-
tween the hydrocarbon chain and the carbon
dioxide (CO₂) moieties breaks. These struc-
tural studies helped to reveal that this bond
breaking occurs within hundreds of picosec- 
onds, concomitant with the preceding elec- 
tron transfer, in full agreement with results 
obtained by spectroscopy. 

The method of choice for time-resolved 
structural studies is serial femtosecond 
crystallography (TR-SFX) using x-ray free 
electron lasers (XFELs) (7), which became 
available about a decade ago. Five facilities 
offering TR-SFX are now operational world- 
wide, three of which have contributed to 
the work of Maestre-Reyna et al. and Chris- 
tou et al. XFEL sources provide extremely 
bright and short (sub-10 fs) x-ray pulses that 
allow diffraction from protein microcrys- 
tals before any radiation damage (diffract- 
before-destroy) (8). Using many individual 
microcrystal diffraction patterns enables the 
determination of reliable steady-state struc- 
tures of proteins that are prone to alteration 
by x-ray radiation (1, 6, 9). When combined
with ultrafast visible pulses, intermediates in 
light-induced processes can be determined. 

Finding appropriate conditions for the 
experiments described by Maestre-Reyna et 
al. and Christou et al. was likely challenging. 
Photolyase is often unstable, and the enzyme 
from the anaerobic archaeon Methanosarcina
mazei appears most suitable to form photoly-
ase-DNA complex microcrystals, which must
be produced at the XFEL facilities under an- 
aerobic and light-shielded conditions. Time- 
resolved spectroscopy on photocatalysis in 
solution has been previously performed on
other photolyases (10, 11) but not on the M.
mazei enzyme, presumably owing to its low
solubility. It is remarkable that these TR-SFX
studies are not preceded by time-resolved
spectroscopic characterization, as had oc-
curred with other colored proteins.


Light-induced DNA repair

Cyclobutane pyrimidine dimer (CPD) DNA photolyase repairs ultraviolet (UV) light-induced CPDs that arise between two adjacent pyrimidine bases. Four temporally distinct
chemical steps occur that involve the flavin-adenine dinucleotide (FAD) coenzyme of the photolyase and the bases to break the two bonds that form the CPD.

The experiments of Maestre-Reyna et al.
and Christou et al. were performed in an in- 
tensity regime of visible light just above the 
level where differences from the struc-
tures obtained without illumi-
nation become assignable. Yet this regime is
far above the linear, physiological regime in 
which less than one photon is absorbed per 
protein. The necessity of working in this re- 
gime, common to nearly all TR-SFX studies 
in proteins, is not fully understood and is 
presently of intense interest (12-14). The au- 
thors argue that, in their respective studies, 
this issue has limited effect because the cata- 
lytic processes are slower than the expected 
nonlinear effects and are in general kinetic 
agreement with linear-regime spectroscopic 
studies in other photolyases. 

The CPD substrate, which consists of a 
pair of doubly cross-linked adjacent thymine 
or cytosine bases, initially bulges from the 
damaged DNA strand into the enzyme. Al- 
though the emphasis of the findings differs, 
both studies come up with a coherent picture 
of the order and timing of events leading to 
the restoration of the intact DNA double he- 
lix (see the figure). Upon population of the 
FADH⁻* excited state, four chemical reactions
are distinguished within the ~10⁻¹¹ to 10⁻⁹ s
time span. Flavin-to-CPD electron transfer
occurs, and this reduction breaks the two
CPD cross-links. In the M. mazei photolyase,
the CPD cross-links were found to break one 
after the other with sizeable delay, both upon 
reduction and between the bond ruptures. 
Subsequently, the electron returns to the fla- 
vin. These events occurred nonconcomitantly, 
which allowed the structural reorganizations 
associated with the subsequent processes to 
be distinguished. Whether the two bonds 
break simultaneously or sequentially has 
been intensely discussed on the basis of spec- 
troscopy and quantum yields (QYs) (the num- 
ber of products formed per absorbed photon; 
subunity QY arises from back reactions) for 
other photolyases (10, 11). The results of Mae-
stre-Reyna et al. and Christou et al. settle the
issue for the M. mazei enzyme; yet given the 
close timescales, and the enzyme-dependent 
QYs, this could vary between photolyases. 

Photolyase catalysis in the chemical sense 
is completed within a few nanoseconds. Two 
distinct steps occur on a much slower time-
scale (>10⁻⁷ s) and correspond to sequential
migration out of the active site and partial 
disordering of the two DNA bases. As pointed 
out in both studies, this trajectory may be in- 
fluenced by the crystalline rather than solu- 
tion environment, hindering full restoration 
of the repaired DNA double-helix structure. 

Christou et al. reported a striking “flip” of 
the three-ring flavin chromophore along the 
butterfly bend angle at 3 ps, which is sug-
gested to functionally stabilize the FADH⁻*
excited state. At such short times, the heat 
from multiple-photon absorption is not dis- 
sipated, and future studies should investigate 
this feature under linear excitation condi- 
tions. A related open question is the origin of 
the subunity QY, reported by Maestre-Reyna 
et al. at ~0.25, which is lower than that for 
other photolyases. A corresponding strong 
decay of the product states was not reported 
in either of the two studies. This suggests 
that wasteful back reactions compete with 
the primary catalytic electron transfer on the 
picosecond timescale, an issue that can be 
addressed by time-resolved spectroscopy. 
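
One way to make this argument concrete is a simplified sketch that is not taken from either study. Suppose a single wasteful back reaction with rate constant $k_\mathrm{back}$ competes with the productive forward step with rate constant $k_\mathrm{fwd}$ (both symbols are assumptions introduced here for illustration). The quantum yield is then the fraction of excitations that proceed forward:

```latex
\begin{align}
\mathrm{QY} &= \frac{k_\mathrm{fwd}}{k_\mathrm{fwd} + k_\mathrm{back}}, \\
\frac{k_\mathrm{back}}{k_\mathrm{fwd}} &= \frac{1}{\mathrm{QY}} - 1 \approx 3
  \quad \text{for } \mathrm{QY} \approx 0.25 .
\end{align}
```

Under this one-step assumption, a QY of about 0.25 would mean the back reaction is roughly three times faster than the productive step, and about $1/\mathrm{QY} \approx 4$ photons are absorbed per repaired CPD on average. In a real multistep pathway the overall QY would be a product of several such branching ratios, so these numbers are indicative only.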

Together, the studies of Maestre-Reyna 
et al. and Christou et al. provide detailed 
structural insights into a complex catalytic 
mechanism and, at the same time, open up 
important new avenues for structural, spec- 
troscopic, and extended quantum computa- 
tional investigations to map out the entire 
reaction pathway and energetics based on the 
observations. Moreover, they are expected to 
promote further development of engineered 
flavin-dependent light-driven biocatalysts in- 
spired by photolyase photochemistry that is 
initiated by the excitation of reduced flavin. 
These green chemistry systems catalyze a 
rapidly expanding range of reactions, includ- 
ing highly specific radical cyclizations and 
polymerizations (15), but are still far less ef-
ficient than native systems (4).

GENOMICS


Mutation hotspots during meiosis


Multiple pathways generate mutations at sites of meiotic recombination in humans


By Frédéric Baudat and Bernard de Massy 


Genome integrity should be under
high surveillance, particularly in 
germ cells. Yet, it is estimated that at 
each generation in humans, an aver- 
age of 61 new mutations occur in the 
germ line (1). These de novo muta-
tions (DNMs) can have deleterious con-
sequences but are also a source of genetic
diversity and adaptability. On page 1012 of 
this issue, Hinch et al. (2) investigated the 
occurrence of mutations during meiotic re- 
combination in the human genome. Meiotic 
recombination is a genome-wide program 
of exchanges between the paternal and ma- 
ternal chromosomes that occurs in germ 
cells (3). They found that the mutation rate 
at meiotic recombination sites was several 
hundred-fold higher than the genome aver- 
age. Some mutations were products of the 
meiotic recombination pathway, and others 
resulted from other DNA repair pathways. 
These mutations are transmitted to the off- 
spring and can be linked to pathologies. 

In germ cells, as in other cell types, DNA 
must be faithfully copied at each cycle of 
replication and can be damaged by exog- 
enous or endogenous insults (4). During 
DNA replication, the proof-reading activity 
of DNA polymerases minimizes the incorpo- 
ration of incorrect nucleotides. However, a 
low level of errors by DNA polymerase re-
mains uncorrected, and improper repair can
occur at DNA lesions, both of which lead
to mutations [for example, single-nucleo- 
tide polymorphisms (SNPs), insertions and 
deletions (indels)]. 

Germ cells undergo meiosis to give rise 
to gametes. During meiosis, the genome 
faces an additional challenge in the form of 
programmed induction of several hundred 
DNA double-strand breaks (DSBs) that are 
repaired by homologous recombination, 


Institut de Génétique Humaine, Université de Montpellier, 
Centre National de la Recherche Scientifique, Montpellier, 
France. Email:bernard.de-massy@igh.cnrs.fr 


1 DECEMBER 2023 + VOL 382 ISSUE 6674 997 


id 


INSIGHTS | PERSPECTIVES 


resulting in exchange of genetic in- 
formation between the homologous 
chromosomes (3). These meiotic 
recombination events involve DSB 


Mutations during meiotic recombination 
During meiotic recombination, which occurs in germ cells before 
they undergo the first meiotic division, DNA double-strand breaks 
(DSBs) form, and errors during their repair can result in mutations 


meiotic recombination represented 
only 0.5% of the total number of 
SNPs and a few percent of the to- 
tal number of indels in germ cells, 


formation, resection of the 5’ ends of 


the DSB resulting in single-stranded _ deletions (indels 
recombination (H 
can be altered owing to instability of the sing! 
tails or the activity of the APOBEC DNA editin 
HR-mediated DNA repair is thought to involve error-prone DNA 
polymerases such as REV1 and DNA polymerase 1n (POLH). Some 

DSBs not repaired by HR are repaired by nonhomologous end-joining 
(X chromosome) or theta-mediated end-joining (autosomes), both of 
which can lead to indels. 


DNA (ssDNA) tails, search for a ho- 
mologous DNA sequence, invasion of 
the homologous DNA, repair synthe- 
sis by DNA polymerases, and further 
processing steps to generate intact 
double-stranded DNA. However, this 
DNA repair can result in errors. In 
yeast, meiotic recombination leads 
to a high mutation rate (5). Genome- 
wide studies also detected mutations 
at the sites of meiotic recombination 
in humans (J, 6). 

Hinch et al, explored genetic data 
from 2976 mother-father-child trios, 
refining an analysis from a previous 
study (1). They also analyzed 70,000 
genomes from the gnomAD data- 
base, in which DNMs were inferred 
from the presence of extremely rare 
allelic variants. The authors searched 
for mutations in the genomic inter- 
vals surrounding meiotic DSB sites 
(recombination hotspots) that have 
been mapped (6) and correspond to 
binding sites for the histone-lysine- 
N-methyltransferase PRDM9 (6, 
7). They discovered several types 
of mutations caused by distinct 
mechanisms. 

Interestingly, a subset of SNPs 
were characterized by their asym- 
metric distribution on both sides of the 
DSB, in line with previous reports (J, 6). 
These SNPs result from substitutions (re- 
placement of a nucleotide by another), 
which are thought to be the outcome of 
the normal homologous recombination 
pathway. Specifically, their pattern revealed 
that they resulted from the instability of 
the ssDNA that is generated upon DSB end 
processing or from DNA polymerase errors. 
On the basis of the nucleotide changes, the 
authors predicted that the ssDNA at meiotic 
DSBs is altered by a member of the APOBEC 
cytidine deaminase family (8), and that the 
repair synthesis is mediated by the error- 
prone translesion DNA polymerases REV1 
and DNA polymerase 7 (POLH) (9) (see the 
figure). 

Hinch et al. also observed SNPs several 
kilobase pairs away from recombination 
hotspots, which suggests that repair some- 
times occurs far from the DSBs through 
a molecular DNA repair pathway called 
break-induced replication. The presence 
of other types of mutations, such as indels 
and structural variants, indicates a failure 
to restore the allelic copy by homologous 
recombination. This can be explained by a
modification of the homologous recombi- 
nation pathway or the use of alternative re- 
pair pathways. The analysis of indels sug- 
gested DSB repair by the DNA polymerase 
θ-mediated end-joining pathway (10).

Hinch et al. also uncovered information 
about factors that influence mutation rate 
and type. The differences in the types of 
some mutations between autosomes and 
the X chromosome suggest that the pairing 
status of the X chromosome (unpaired in 
males) affects the recombination pathway. 
Specifically, the identified mutations sug- 
gest a more frequent use of the nonhomolo- 
gous end-joining (NHEJ) repair pathway on 
the X chromosome compared to autosomes 
in male meiosis. As the NHEJ pathway is 
thought to be more active in late meiotic 
prophase, a delay in DSB repair on the X 
chromosome may allow increased NHEJ. 
The rate of mutations related to meiosis 
was also threefold higher in males than fe- 
males. Whether this is related to the higher 
global germline mutation rate in males 
than females is not known (11). 

Although the mutations resulting from [...]
they have distinct properties that
are highly relevant to genome in- 
tegrity and genetic diversity. For 
example, they occur at recombi- 
nation hotspots, which constitute 
a very small portion (1 to 2%) of 
the genome, and their rate is ex- 
tremely high locally. Recombination 
hotspots are therefore at very high 
risk of genomic instability. These 
hotspots can be within or outside 
genes and are a potential source of 
deleterious mutations, as previously 
reported (12). Hinch et al. identified
several diseases caused by muta- 
tions that might result from meiotic 
recombination at hotspots, for ex- 
ample, mutations in trimethyllysine 
hydroxylase ε (TMLHE) that cause
X-linked autism. As the locations 
of the PRDM9-dependent hotspots 
can differ between individuals and 
change during the evolution of the 
human genome (13), the locations
of mutations resulting from meiotic 
recombination are thought to cover 
a larger set of genomic regions than 
the current hotspots. 

Genome stability is also crucial for 
processes such as aging and tumori- 
genesis (14). The study by Hinch et al.
leads to predictions about the molec- 
ular mechanisms of the mutagenic 
activity ensuing from meiotic recombina- 
tion that will be exciting to test and that 
are important for understanding the main- 
tenance of genome stability in general. 
ASTRONOMY


A low-mass star with a large-mass planet 


A large planet orbiting a very low-mass star challenges theories of planet formation 


By Frédéric Masset 


Planets form in systems of gas and dust that surround newborn stars.
These protoplanetary disks are reservoirs for material that will eventually
become planets (1). The more massive the disk, the more massive the
planets that can form out of it. The mass of
dust contained in the disk scales steeply with 
the star mass (2). It is therefore unsurprising 
that planets found around M dwarfs—the 
least massive stars among the spectral clas- 
sification of stars—tend to have lower masses 
than in other systems. Yet on page 1031 of this 
issue, Stefansson et al. (3) report the discov- 
ery of a planet with a mass of at least 13.2 
Earth masses on a close orbit around a very 
low-mass star (LHS 3154), which is unheard 
of for a star that has a mass only 0.11 times 
that of the Sun. 

Among the known planets orbiting very 
low-mass stars with an orbital period of 
less than 10 days, the planet discovered by 
Stefansson et al., called LHS 3154b, is an outlier:
Its minimum mass ratio with the star is
3.5 × 10⁻⁴. This is more than twice the largest
such ratio previously known. This extraordinary
property led the authors to perform
simulations of planet formation to check 
whether a planet such as LHS 3154b could 
emerge from a protoplanetary disk similar to 
those detected around very low-mass stars. 
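
The quoted ratio can be checked with a few lines of arithmetic. The sketch below uses the rounded values given above (13.2 Earth masses and 0.11 solar masses) together with a standard Earth-to-Sun mass ratio of about 3.003 × 10⁻⁶, which is an assumption not stated in the text; the function name is illustrative.

```python
# Earth mass expressed in solar masses (standard value, not taken from the article).
EARTH_TO_SUN_MASS = 3.003e-6


def planet_star_mass_ratio(planet_mass_earth: float, star_mass_sun: float) -> float:
    """Return the planet-to-star mass ratio for masses in Earth and solar units."""
    return planet_mass_earth * EARTH_TO_SUN_MASS / star_mass_sun


# Rounded values quoted in the text for LHS 3154b and its host star.
ratio = planet_star_mass_ratio(13.2, 0.11)
print(f"minimum mass ratio = {ratio:.2e}")
```

With these rounded inputs the result is about 3.6 × 10⁻⁴, in line with the 3.5 × 10⁻⁴ quoted above; the small difference presumably reflects rounding of the stellar mass.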

Planets can form either through the frag- 
mentation of a protoplanetary disk under 
its own gravity, yielding large clumps that 
eventually become massive planets, or from 
gradual accretion of solids onto planetary 
embryos, possibly followed by the accretion 
of gas. The former scenario is highly unlikely 
to occur at the short distance at which LHS 
3154b orbits its star. It also would produce 
planets substantially more massive than LHS 
3154b (4), such as those orbiting the very low- 
mass star GJ 3512 (5). Because the mass of 
LHS 3154b and its orbital distance seem to 
rule out disk instability as a viable forma- 
tion mechanism, the authors considered the 
second scenario, called core accretion. Their 
simulations started with many planetary em- 
bryos 1000 km in diameter, scattered across 
the disk, which initially grow through mutual 
collisions and by accreting solids. This accretion
can unfold either from a swarm
of asteroid-sized bodies called plan- 
etesimals or from much smaller 
bodies called pebbles, typically with 
a size in the centimeter-to-meter 
range. From previous work (6), 
Stefansson et al. noted that more 
massive planets are obtained in 
close-in orbits around M dwarfs 
from planetesimal accretion, and 
the authors specifically focused 
on this mode of growth. In addi- 
tion to accreting solids, embryos start 
accreting gas when their masses exceed a 
threshold on the order of a few Earth masses. 
As embryos grow larger, they gravitation- 
ally interact with the disk around them, 
which changes the size of their orbits. Over 
time, these orbits generally shrink, gradually 
drawing the embryos closer to the star (7). 
This mix of growth and orbital decay ceases 
once the disk dissipates, which occurs on a 
time-scale of 1 million to 10 million years. 

Crucial to the final outcome is the mass 
of solids initially present in the protoplan- 
etary disk, as well as their distribution with 
respect to the distance to the star. Because 
these properties vary widely among disks, 
Stefansson et al. performed many simula- 
tions, with diverse solid contents that match 
the statistics obtained from observations of 
disks around very low-mass protostars in 
the Chamaeleon star-forming region (2). Out
of 300 simulations, no planet larger than 10 
Earth masses and with an orbital period less 
than 10 days was identified. However, repeat- 
ing the simulations with 10 times more solids 
sometimes produced planets with such prop- 
erties. Changing the initial disk to have a 
more compact dust distribution made it even 
more likely that these planets would form, 
but simply making the distribution more 
compact without adding more solid mass did 
not yield any such planet. 
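
A null result in a finite number of runs still leaves room for rare outcomes. As a rough illustration that is not part of the authors' analysis, and assuming the 300 simulations can be treated as independent trials, the exact binomial bound for zero successes gives a 95% upper limit on the per-simulation probability of forming such a planet:

```python
def upper_bound_zero_successes(n_trials: int, confidence: float = 0.95) -> float:
    """Upper confidence bound on a success probability after 0 successes in n_trials.

    Solves (1 - p) ** n_trials = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / n_trials)


bound = upper_bound_zero_successes(300)
# The result is close to the "rule of three" approximation, 3 / 300.
print(f"95% upper bound: {bound:.4f}")
```

Under these assumptions the bound is roughly 1%, so the simulations disfavor, but cannot strictly exclude, the formation of such planets from disks with the observed solid masses.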

The absence of an LHS 3154b analog in simulations
using solid mass data from observations
suggests that these masses are simply
not enough. The masses of solids inferred 
from observations of protoplanetary disks 
are rarely sufficient to produce known 
planetary systems (8, 9). Several explanations
might account for this discrepancy.
It could be that planets form
very early during the disk phase
(10); thus, we would essentially
observe the leftovers of planetary
formation. Taking into account selection
and observational biases
in the exoplanet and disk samples, 
and assuming a 100% efficiency of 
planet formation, the discrepancy 
disappears, at least for Sun-like 
stars (11), but so far no mechanism
has been found that converts all
the solid mass into planets, even
if planet formation in dust rings
could be efficient (12). Another explanation
is that we do not see all
the dust. Dust masses in well-studied
nearby protoplanetary disks could be almost
an order of magnitude larger than previously
thought (13, 14). It is plausible that the complete
solution to this discrepancy involves a
mix of these different explanations. The re- 
sults of Stefansson et al. suggest that a sub- 
stantial increase in the mass of solids could 
be an important part of that solution. 


REFERENCES AND NOTES 


1. J. Drazkowska et al., ASP Conf. Ser. 534, 717 (2022).
2. I. Pascucci et al., Astrophys. J. 831, 125 (2016).
3. G. Stefansson et al., Science 382, 1031 (2023).
4. A. Boss, S. Kanodia, Astrophys. J. 956, 4 (2023).
5. J. C. Morales et al., Science 365, 1441 (2019).
6. Y. Miguel et al., Mon. Not. R. Astron. Soc. 491, 1998 (2020).
7. S. J. Paardekooper et al., ASP Conf. Ser. 534, 685 (2022).
8. C. F. Manara, A. Morbidelli, T. Guillot, Astron. Astrophys. 618, L3 (2018).
9. S. M. Andrews, Annu. Rev. Astron. Astrophys. 58, 483 (2020).
10. J. Najita, S. J. Kenyon, Mon. Not. R. Astron. Soc. 445, 3315 (2014).
11. G. D. Mulders, I. Pascucci, F. J. Ciesla, R. B. Fernandes, Astrophys. J. 920, 66 (2021).
12. H. Jiang et al., Mon. Not. R. Astron. Soc. 518, 3877 (2022).
13. Z. Xin, C. C. Espaillat, A. M. Rilinger, A. Ribas, E. Macias, Astrophys. J. 942, 4 (2023).
14. Y. Liu et al., Astron. Astrophys. 668, 175 (2022).


10.1126/science.adl3365




INSIGHTS | PERSPECTIVES 


RETROSPECTIVE 


Evelyn Fox Keller (1936-2023) 


Physicist and feminist scholar of science 


By Angela N. H. Creager 


Evelyn Fox Keller, scientist, feminist
scholar, and author of influential publications
on genetics, developmental
biology, and scientific language, died on
22 September. She was 87. After training
in physics and working in mathematical
biology, Evelyn turned her attention
to understanding how societal constructs, 
especially gender, guide science. She brought 
feminist insights into the history and philos- 
ophy of biology and sparked broader inter- 
disciplinary conversations about the role of 
metaphor and rhetoric in science. 

Born in New York City on 20 March 1936 
to Russian Jewish immigrant parents, Evelyn 
was the youngest of three children. Her siblings,
biologist Maurice Fox and political
scholar-activist Frances Fox Piven, brought 
Evelyn into contact with prominent intellec- 
tuals. At age 16, she was contacted by physi- 
cist Leo Szilard in his restless search for scien- 
tific talent. In 1957, she received a bachelor’s 
degree in physics from Brandeis University, 
under the mentorship of Silvan Schweber. 
Evelyn went on to graduate school at 
Harvard University, aiming to become a theo- 
retical physicist, but found herself treated as 
an oddity there. She felt profoundly isolated 
as a woman in physics, surrounded by a “sea 
of seats” even in a full classroom. Spending 
time at Cold Spring Harbor Laboratory, she 
saw a path forward in molecular biology. 
Evelyn began conducting experiments on 
bacteriophages, first with Frank Stahl and 
then with Matthew Meselson, and produced 
a dissertation formally advised by Walter 
Gilbert (then appointed in physics). After 
earning her PhD from Harvard in 1963, she 
began teaching physics at Cornell University 
Medical College and working as a research 
assistant for Joseph B. Keller, whom she mar- 
ried in 1963. She then began a productive col- 
laboration with applied mathematician Lee 
Segel on slime mold aggregation. Although 
their joint papers were widely cited, Evelyn 
had difficulty finding an academic position. 

In 1972, a new interdisciplinary branch of 
the State University of New York in Purchase 
provided Evelyn with what she has called 
“an intellectual and professional harbor.” 
She was one of several academic feminists 


Department of History, Princeton University, Princeton, NJ, 
USA. Email: creager@princeton.edu 


1000   1 DECEMBER 2023 • VOL 382 ISSUE 6674


on that campus. Evelyn taught courses on 
women and science and started paying atten- 
tion to how scientists drew on gender ideol- 
ogy, which resulted in privileging attributes, 
such as objectivity, that were associated with 
masculinity, while devaluing those, such as 
empathy, associated with femininity. 

Evelyn’s widely read biography of Barbara 
McClintock, A Feeling for the Organism, drew 
on these feminist insights and emphasized 
how McClintock was a maverick, one who de- 
nied that her sex accounted for her originality. 
The book was published in 1983; McClintock
received the Nobel Prize in Physiology or
Medicine the same year, heightening the bi- 
ography’s impact. Many readers misunder- 
stood Evelyn to be saying that women do 
science differently than men, inferring that 
she promoted a “feminine” science. Evelyn 
strove to clarify that masculinity and femi- 
ninity were cultural constructs, not biologi- 
cal realities. Her second book, Reflections on 
Gender and Science, published in 1985, honed 
her feminist arguments by applying them to 
examples from the history of science, from 
the masculinist language of Robert Boyle 
in early modern England to the nonhierar- 
chical thinking behind her own slime mold 
research. Addressing gender biases, she be- 
lieved, could lead to better science. 

Evelyn’s growing prominence brought new 
recognition and opportunity. After several 
years of teaching at Northeastern University, 
in 1988 she was appointed professor in the 
Department of Rhetoric at the University of
California, Berkeley. She was interviewed by 
Bill Moyers in 1990 for his PBS show “A World 
of Ideas” and explained to a broader audience 
and by gender norms. In 1991, she received
an honorary doctorate, the first of many, 
from Mount Holyoke College. The following 
year she was named a MacArthur Fellow 
and hired by the Massachusetts Institute 
of Technology (MIT) to join the faculty of 
its Program in Science, Technology, and 
Society. She remained at MIT, as profes- 
sor emerita, until her death. Retaining her 
keen sensitivity to gender, Evelyn turned to 
broader questions about how conceptions 
of life and heredity had changed over time. 

In her books Refiguring Life (1995), The 
Century of the Gene (2000), and Making 
Sense of Life (2002), Evelyn focused on how 
geneticists discussed causation. American 
geneticists in the early- to mid-20th cen- 
tury often described phenotypes in terms 
of “gene action.” Molecular biologists of the 
1950s through the 1970s adopted this same 
idiom, reinforcing the activity of genes. Yet, 
as molecular biologists moved from work- 
ing on bacteria and viruses to flies, frogs, 
and mice, and as they tackled older prob- 
lems such as tissue differentiation, their lan- 
guage changed. Eric Davidson’s 1968 book 
Gene Activation in Early Development, she 
noted, registered their shift in terminology 
from gene action to gene activation. The re- 
wording exposed the fundamental circular- 
ity of genetic notions of agency—where are 
the agents that control the genes, so that 
the genes can control development? Evelyn 
argued that newer metaphors, many based 
on comparing the organism to the computer, 
made way for an understanding of distrib- 
uted agency and networks of action rather 
than simple causation. 

In the 2000s, Evelyn turned her atten- 
tion to the crisis of climate change and to 
communicating with a skeptical public. In 
2017, she and Philip Kitcher published The 
Seasons Alter: How to Save Our Planet in Six 
Acts. It was one of many collaborations— 
with scientists, feminist scholars, philoso- 
phers, and historians—that she undertook 
over her career. 

I got to know Evelyn during her early years 
at MIT, when I was a postdoctoral fellow. We 
met regularly to discuss scientific papers 
from the 1950s and 1960s. How biologists 
used language, she showed me, revealed how 
they understood life. I came to see that for 
all of her acclaim, she was never entirely at 
home in the world of academic disciplines. 
She always saw her writing about science 
as an extension of her work as a scientist. 

A habitual boundary-crosser, Evelyn 
worked to open up the fields she traversed 
to other participants and other voices. Her 
death deprives us of a brilliant critic and 
commentator on the life sciences.
The costs of “costless” climate mitigation


The IPCC and leading economic models have different ideas about emissions reduction costs 


By Matthew J. Kotchen, James A. Rising,
Gernot Wagner


How much will it cost to meaningfully
reduce greenhouse gas (GHG) emissions
on a global scale? The answer
is critical for assessments of how to
address climate change, affecting
public support, political will, and
policy choices. We find that the “bottom- 
up” estimation approach emphasized by the 
United Nations Intergovernmental Panel 
on Climate Change (IPCC) reports consid- 
erably lower costs for emission reductions 
than leading “top-down” economic models. 
We also find that one core feature explains 
the vast majority of the difference: The 
bottom-up estimates include substantial 
reductions that appear to come at zero 
cost, or even at a savings, whereas the eco- 
nomic models assume no such “free lunch.” 
The fact that different methodological ap- 
proaches produce different results may not 
be surprising. But that nearly all of the 
discrepancy loads on how much mitiga- 
tion is seemingly costless raises important 
challenges for understanding and com- 
municating the actual costs of reducing 
emissions. 

We compare two of the leading ap- 
proaches for estimating marginal mitiga- 
tion costs: the bottom-up, sector-by-sector 
approach employed by the IPCC and others
(1, 2), and the top-down approach built
into leading integrated assessment models 
(IAMs) in the climate-economics literature 
focused on benefit-cost analyses (3-5). 
Research on the “energy efficiency para- 
dox” foreshadows how results may differ 
between approaches (6, 7). The paradox 
arises because adoption rates of energy 
efficiency investments typically fall short 
of predictions based on the cost savings 
estimated by bottom-up, engineering approaches.
The reason is that those estimates do not take
full account of direct and indirect costs.
This might suggest that
the IPCC and other bottom-up approaches
understate the full costs of reducing GHG
emissions. But there may also be concerns 
with top-down estimates, as broad analy- 
ses of environmental and climate policies 
find that costs are likely overstated (8-10). 
We therefore find scope for a potential mid- 
dle ground. 


A COMPARATIVE COST ANALYSIS 

We focus primarily on the IPCC’s headline 
costs reported as mitigation potentials in 
the Summary for Policymakers of Working 
Group III in the Sixth Assessment Report 
(1). It is upon these estimates that the IPCC 
asserts with “high confidence” that global 
GHG emissions could be reduced by at least 
half in 2030 at costs less than $100 per 
tonne of CO₂ equivalent (tCO₂-eq). For these
estimates, the IPCC, as McKinsey & Company
famously did over a decade earlier (2),
estimated the cost of abatement opportunities
bottom-up, sector by sector, and technology
by technology.

The IPCC provides estimates of the 2030 
mitigation potential at a range of costs 
for 43 different activities (e.g., renewable- 
energy installations, alternative land 
management strategies, energy-efficient 
buildings, fuel switching) in six sectors: en- 
ergy; agriculture, forestry, and other land 
uses (AFOLU); buildings; transport; indus- 
try; and other. The analysis is based on a 
comparison of net lifetime monetary (“out 
of pocket”) costs of avoided emissions be- 
tween particular mitigation activities and a 
reference technology, where the estimates 
are based on 175 different sources, many of 
which are regionally specific. For example,
electric vehicles are compared to specific
internal combustion engine vehicles on the
basis of estimated manufacturing and operating costs,
changes in emissions, and assumed adoption rates.

Comparison of global mitigation potentials at different costs

The IPCC results use different baseline emissions to calculate the range of mitigation potentials. The top panel
reports the full set of results, and the bottom panel reports only the mitigation potentials with costs >$0 per
tonne of CO₂ equivalent (tCO₂-eq). USD reported in 2020 dollars. See supplementary materials.

The IPCC results can be
used to plot an aggregate
marginal cost curve for mitigation
(see the first figure,
top panel). The marginal cost
per tCO₂-eq is increasing in
the level of aggregate emis- 
sion reductions, showing that 
obtaining greater reductions 
requires more higher-cost ac- 
tivities. Across all sectors, the 
IPCC estimates the potential 
for a 32% reduction in GHG 
emissions in 2030 for activi- 
ties costing $20 or less per 
tCO₂-eq; that potential increases
to 43% at costs up to
$50. These findings are not de- 
pendent on imposing a policy 
requiring payment for emis- 
sions, such as a carbon tax. In- 
stead, they focus on the technical potential 
and, for example, imply that a 32% reduc- 
tion in emissions would occur if all activi- 
ties costing less than $20 per tonne were to 
take place. 
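
The construction of such a curve can be sketched as follows: sort activities by cost per tonne, then accumulate their potentials. The activity names and numbers below are hypothetical placeholders chosen for illustration, not IPCC values.

```python
from typing import NamedTuple


class Activity(NamedTuple):
    name: str
    cost_per_tonne: float  # USD per tCO2-eq; negative means net savings
    potential_pct: float   # share of baseline emissions that could be avoided


def marginal_cost_curve(activities):
    """Return (cost, cumulative potential) points, cheapest activities first."""
    curve, cumulative = [], 0.0
    for act in sorted(activities, key=lambda a: a.cost_per_tonne):
        cumulative += act.potential_pct
        curve.append((act.cost_per_tonne, cumulative))
    return curve


def potential_at(curve, cost_ceiling):
    """Technical potential if every activity costing no more than cost_ceiling is adopted."""
    eligible = [cum for cost, cum in curve if cost <= cost_ceiling]
    return eligible[-1] if eligible else 0.0


# Hypothetical activities, for illustration only.
activities = [
    Activity("efficient lighting", -10.0, 4.0),
    Activity("wind power", 15.0, 6.0),
    Activity("fuel switching", 40.0, 5.0),
    Activity("carbon capture", 120.0, 3.0),
]
curve = marginal_cost_curve(activities)
print(potential_at(curve, 20.0))
```

Reading the curve at a given cost ceiling gives a technical potential of the kind described above: the reduction that would occur if every activity at or below that cost were adopted, with no claim about whether adoption actually happens.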

The IPCC reports potentials for five dif- 
ferent cost bins, including one where costs 
are <$0, which turns costs into savings, at 
least with respect to the needed monetary 
expenditures. A saving rather than cost oc- 
curs when, for example, the full cost of elec- 
tric vehicles (including maintenance and 
operating costs) is estimated to be below 
that of conventional substitutes, and the 
former is associated with lower emissions. 

We also consider results of the McKin- 
sey cost curve, a useful benchmark because 
it is so well-known, frequently invoked to 
justify policy, and a methodologically simi- 
lar point of reference for the IPCC results. 
The 2030 mitigation potentials, according 
to McKinsey’s 2009 estimates, are greater 
than the IPCC’s 2022 estimates at all costs 
>$0, but both provide nearly identical esti- 
mates that a 16% reduction in GHG emis- 
sions in 2030 would result in net monetary 
savings for the global economy (see the 
first figure, top panel). 

However, there are nonmonetary barri- 
ers to the adoption of mitigation activities. 
For example, even if an electric car is less 
expensive than a conventional alternative, 
some people might prefer the alternative 
for a host of reasons, including performance 
and familiarity. In these cases, switching to 
the electric vehicle may no longer be con- 
sidered a cost savings after accounting for 
these preferences and trade-offs. In general, 
barriers to adoption of mitigation and energy-efficient
activities include lack of information
and capital for higher upfront costs,
societal costs and organizational barriers
such as the provision of charging stations,
uncertainty around a new technology, and
behavior-based rigidities. So how much,
then, would it cost to get more people to
switch? Taking account of the direct expenses
alone is incomplete, but how to
include the indirect costs (i.e., the full
opportunity costs) in the accounting is less
clear and often highly specific to the individual.

IPCC mitigation potentials in 2030 by sector

Mitigation potential is reported separately for different cost bins. “Other” includes reduced
emissions from fluorinated gas and methane from solid waste and wastewater. The sum of
these potentials for each bin corresponds with the IPCC point estimates in the first figure.
USD reported in 2020 dollars. See supplementary materials.
AFOLU, agriculture, forestry, and other land use; IPCC, Intergovernmental Panel on Climate Change.
Although the IPCC acknowledges non- 
monetary barriers to the adoption of miti- 
gation activities, they are not taken into 
account in its bottom-up (and bottom-line) 
estimates, nor is their omission mentioned 
in the high-confidence recommendation of 
what can be accomplished by 2030. Part 
of the reason is likely that nonmonetary 
barriers are more difficult to quantify, re- 
sulting in less literature on which to draw. 
Another reason may be to emphasize the 
potential cost savings, hoping to change 
preferences and behaviors that create bar- 
riers in the first place, thereby lowering 
their cost. A potential downside of focus- 
ing only on direct monetary costs, how- 
ever, is misleading communication about 
the full extent of how costly emissions re- 
ductions will be. 

In contrast to bottom-up, engineering-focused
models, the benefit-cost IAMs
take a top-down approach, representing 
abatement costs as an aggregate function 
of emission reductions or carbon prices. 
Because the models capture aggregate 
global trends, abatement costs are not 
specified for particular activities. Instead,
they capture trade-offs
with how abatement ac- 
tivities decrease funding 
available for other in- 
vestments that promote 
economic growth, while 
also accounting for the 
benefits of technological 
learning. For example, a 
policy that promotes the 
manufacturing of elec- 
tric vehicles would mean 
fewer resources for other 
productive investments, 
while also spurring learn- 
ing and spillovers across 
regions that lower the 
cost of future emissions 
reductions. The net ef- 
fect of these different 
impacts, which vary by 
model, contributes to the 
marginal costs of reduc- 
ing emissions. All of the 
models, however, assume 
that mitigation activities must be costly, 
otherwise they would already occur with- 
out the need for policy and be included in 
business-as-usual baselines. 
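
The top-down logic can be illustrated with a power-law abatement cost function of the kind used in DICE, in which the marginal cost of abatement rises from zero with the fraction of emissions abated. The sketch below is not the calculation behind the curves discussed in this article; the backstop price and exponent are illustrative assumptions, and the function names are hypothetical.

```python
def marginal_abatement_cost(abated_fraction: float,
                            backstop_price: float = 550.0,
                            exponent: float = 2.6) -> float:
    """Marginal cost (USD per tCO2) of abating a given fraction of baseline emissions.

    Assumed form: backstop_price * fraction ** (exponent - 1).
    """
    return backstop_price * abated_fraction ** (exponent - 1.0)


def mitigation_potential(carbon_price: float,
                         backstop_price: float = 550.0,
                         exponent: float = 2.6) -> float:
    """Fraction of emissions abated when marginal cost equals the carbon price."""
    if carbon_price <= 0.0:
        return 0.0  # by construction, nothing is abated at zero or negative cost
    fraction = (carbon_price / backstop_price) ** (1.0 / (exponent - 1.0))
    return min(fraction, 1.0)


for price in (0, 20, 50, 100):
    print(price, round(100 * mitigation_potential(price), 1))
```

Because the curve passes through the origin, a model of this form cannot produce mitigation at zero or negative cost, which is the assumption that separates it from the bottom-up estimates.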

We focus primarily on the three eco- 
nomic IAMs used around the world and 
traditionally employed by the US govern- 
ment (and other countries) to derive offi- 
cial estimates of climate damages: DICE, 
FUND, and PAGE (3-5). For each, we are 
able to produce model-based marginal cost 
curves for GHG emission reductions in 
2030 (see supplementary materials). These 
results can be interpreted equivalently 
as mitigation potentials (i.e., the percent 
reduction in GHG emissions compared 
to a baseline) at different costs and com- 
pared directly to those from the bottom-up 
approaches. 

The estimated mitigation potentials of all 
three IAMs are lower than the bottom-up 
estimates (see the first figure, top panel), 
especially below costs of $200 per tCO₂-eq.
For example, at costs below $50, the mean 
estimate among the IAMs is a 26% reduc- 
tion in 2030 compared to 43% for the IPCC. 
Nevertheless, all three IAM estimates fall 
within the range of uncertainty based on 
the IPCC potentials at $200, which is the 
only level at which uncertainty is reported. 

A striking observation is that most of 
the divergence between approaches is the 
result of what happens at costs <$0. None 
of the climate-economic IAMs include any 
costless mitigation, in contrast to the IPCC 
and McKinsey curves (see the first figure, 
top panel). A fundamental insight is that 
the two approaches send very different 
messages about the likely costs of emissions
reductions less than a decade into
the future. Specifically, the IPCC estimates 
show mitigation potentials approximately 
twice the magnitude of the mean across 
the IAMs at costs ranging from $20 to $50. 

But when we consider only emission re- 
ductions that occur at costs >$0, removing 
the 16% of costless reductions in the IPCC 
and McKinsey analyses, the top-down and 
bottom-up estimates closely match (see 
the first figure, bottom panel). When thus 
given the same starting point, the mean 
mitigation potential across the three IAMs 
differs from the IPCC estimate by less than 
2 percentage points at costs ranging from 
$20 to $50. 
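
The effect of removing the costless portion can be reproduced from the percentages quoted in the text. The sketch below uses only those figures (IPCC potentials of 32% and 43% at $20 and $50, 16% costless mitigation, and a mean IAM potential of 26% at $50); the variable names are illustrative.

```python
# Figures quoted in the text (percent reduction in 2030 GHG emissions).
ipcc_potential = {20: 32.0, 50: 43.0}  # cumulative potential at costs up to $20 and $50
costless_share = 16.0                  # reductions reported at costs below $0
iam_mean_at_50 = 26.0                  # mean across DICE, FUND, and PAGE below $50

# Shift the bottom-up curve so that it starts at zero, like the top-down models.
ipcc_positive_cost = {cost: pct - costless_share for cost, pct in ipcc_potential.items()}

gap_before = ipcc_potential[50] - iam_mean_at_50
gap_after = ipcc_positive_cost[50] - iam_mean_at_50
print(f"gap at $50 including costless mitigation: {gap_before:.0f} points")
print(f"gap at $50 excluding costless mitigation: {gap_after:.0f} points")
```

At $50 the gap falls from 17 percentage points to 1, consistent with the statement that the two approaches differ by less than 2 percentage points once costless reductions are set aside.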

Finally, we consider comparisons to six 
other IAMs often called “process-based” mod- 
els because they focus more on bottom-up in- 
tegration of energy and biophysical systems 
rather than on benefit-cost analysis (11-13).
The models that we consider are also used 
in different portions of the IPCC analysis. 
We find that these models are more closely 
aligned with the bottom-up estimates than 
the benefit-cost IAMs with regard to mitiga- 
tion potentials and the treatment of costless 
mitigation (see supplementary materials). 
This finding is important because whereas 
the IPCC’s bottom-up analysis acknowledges 
omission of indirect, nonmonetary costs, the 
same qualification is typically not associated 
with policy analysis coming out of the pro- 
cess-based models. 


RESEARCH NEEDS 

We have shown how different starting 
points on costless mitigation explain the 
vast majority of the divergence in the IPCC’s 
bottom-up estimates and top-down eco- 
nomic approaches. But the critical question 
is: Which starting point is more helpful for 
understanding how costly it will be for so- 
cieties to reduce emissions, and which poli- 
cies will be most effective in prompting these 
changes? 

By holding to the position that there’s no 
such thing as a free lunch, economists may 
be overestimating the true cost of mitigation. 
This could occur in top-down models be- 
cause of failure to recognize that inefficient 
decision-making exists for individuals, in- 
dustry, and governments, meaning that there 
are opportunities to reduce GHG emissions 
while lowering costs—a so-called “win-win.” 
The question is how pervasive such opportunities
are. Whereas the economic IAMs
assume there are none, the IPCC asserts 
that win-wins can account for up to 16% of 
emission reductions in 2030. But, as we have 
shown, the IPCC estimate omits a whole cat- 
egory of difficult-to-quantify nonmonetary 
costs. Given this wide range, we argue that 
more explicit communication and empirical
evidence are necessary to provide greater clarity
and resolution on the policy-relevant costs
of mitigation. 

To help set the research agenda, we pro- 
vide a breakdown of sectors where the is- 
sue appears most important (see the second 
figure). In the four sectors where the IPCC 
reports costless mitigation, its magnitude 
is often a substantial portion of the full po- 
tential reported at any cost. This suggests 
that, in many cases, resolving the question 
of how much mitigation is costless is more 
important than refining estimates of the 
mitigation potential at costs >$0. This is 
particularly true in the energy, buildings, 
and transportation sectors. 

We see several opportunities to move for- 
ward. First, as an institutional matter, the 
engagement of economists with the IPCC has 
been in decline (14), and efforts to promote
greater involvement will help. The need for 
greater engagement goes both ways: climate 
scientists wanting to engage with econo- 
mists, and vice versa. Economics is focused 
on the study of trade-offs, and the questions 
being raised here are about how best to quan- 
tify and communicate the trade-offs associ- 
ated with reducing GHG emissions. Our ex- 
perience is that some climate scientists rarely 
appreciate reminders of trade-offs when it 
comes to a problem rightfully viewed with 
an increasing sense of urgency. Nevertheless, 
without a clear-eyed view about the full costs 
(i.e., trade-offs) of addressing climate change, 
the community of researchers runs the risk of 
miscommunicating the scale of the challenge 
ahead and of drifting further away from the 
political realities of policy formulation and 
implementation. 

Second, the literature on barriers to the 
adoption of mitigation and energy effi- 
ciency activities can be reoriented to play a 
more constructive role. Studies of particu- 
lar programs aimed at reducing emissions 
often find actual costs that are substan- 
tially greater than anticipated (6, 7). What 
is needed is a greater sense for what types 
of activities and in what sectors the gaps are 
likely to be small or large, and which ones 
might be especially important for scaling up 
to estimates on a global scale. Additionally, 
beyond documenting the gaps, more behav- 
ioral and social science research is needed 
on how they can be overcome and help in- 
form real-world policies. 

Third, research is needed with an eye 
toward reconciling seemingly conflicting 
results. Whereas studies of particular inter- 
ventions or projects often find higher costs 
than expected according to bottom-up ap- 
proaches, the opposite is true for many 
analyses of more general environmental and 
climate policies, for which realized costs are 
considerably less than expectations based on
top-down approaches (8-10). Why? And what
lessons can be learned for estimates of global 
marginal abatement costs? A plausible inter- 
pretation, suggesting a middle ground, is that 
bottom-up costs often need revising upward, 
whereas top-down costs often need revising 
downward. 

Finally, the benefit-cost focused IAMs 
should be continually updated to reflect the 
latest empirical evidence on mitigation costs. 
An expanding literature on the monetary 
damages of climate change focuses on cross- 
validation and calibration between IAMs 
and bottom-up empirical estimates (15).
Nevertheless, we are not aware of similar 
efforts that seek to cross-validate top-down 
mitigation cost assumptions with bottom-up 
empirical estimates. And yet, compared to 
the nonmarket effects of climate damages, 
the costs of mitigation are generally more ob- 
servable and immediate, suggesting a prom- 
ising avenue for improved estimation. 

To conclude, the direct monetary cost of
adopting mitigation activities is informative,
but incomplete. It ignores real yet indirect 
opportunity costs that are potentially more 
difficult to measure. Distinguishing between 
these two notions of cost is helpful for in- 
terpreting and communicating different 
estimates coming out of the scientific com- 
munity and for setting a research agenda to 
promote greater reconciliation.

Surtsey at 60


A volcanic island off the coast of Iceland celebrates 
six decades as a living laboratory 


By Joe Roman 


Before dawn on 14 November 1963, the
Isleifur 2 set a bottom longline off the
southeast coast of Iceland (1). The engineer
on the Icelandic fishing boat
noticed a strong smell of sulfur. Dark
smoke emerged from the surface of
the sea. Soon tephra—ash, cinders, and lapilli—
spewed up from the ocean. By the next
morning, a new island had risen 10 m above
the surface of the North Atlantic Ocean.

About 3 months after the eruption, 
Sigurdur Thorarinsson, a professor at the 
University of Iceland, was the first volcanolo- 
gist to arrive on the new island, which was 
later given the name Surtsey, after the Norse 
giant Surtur. Biologists soon followed, tracking
the arrival of organisms to this new landform,
which would cover about 1.4 km² along
the Mid-Atlantic Ridge. 

Seabirds began nesting on Surtsey in 
the 1970s. During his first visits, Erling 
Olafsson, a young Icelandic entomologist, 
recorded each bird he found, creating a 
timeline of new arrivals. Black guillemots 
and fulmars were the first birds to breed 
on Surtsey, later followed by great black- 
backed gulls, herring gulls, and lesser 
black-backed gulls. As more seabirds ar- 
rived, white streaks of guano—rich in 
phosphorus and nitrogen, the limiting
nutrient for many vascular plants—started
to build up around the nests. The breeding 
colony is now a lush grassland, dominated 
by lyme grass (Leymus arenarius), Arctic 
fescue (Festuca richardsonii), and common 
meadow grass (Poa pratensis). It receives 
about 30 times more nitrogen each year 
than the birdless lava fields surrounding it 
(2). As a result, plant biomass in the breed- 
ing colony is about 60 times higher than in 
the areas without nesting birds. 
More than 50 of the 78 plants 
recorded on the island were 
likely dispersed by gulls and 
other birds. 

There are plenty of islands in
the world, but Surtsey is one of
the youngest, with the lightest
of human footprints; only a few
researchers are allowed to visit
it each year. In 2008, the island
was added to the UNESCO World
Heritage List because of its status
as a “pristine natural laboratory.”
Charlie Crisafulli, a volcano
ecologist who visited Surtsey after the designation,
called it “an ecologist’s dreamworld.”
George Evelyn Hutchinson was one of
the first biologists to collect and synthesize
evidence for the ecological importance of
guano. In The Biogeochemistry of Vertebrate
Excretion, published in 1950, he built the case
for protecting large bird colonies by showing
that seabirds, rather than competing with
people for food, could increase fish populations
by enhancing nutrients close to their
nesting grounds. Recent work on the Chagos
Archipelago in the Indian Ocean has shown
that seabirds influence terrestrial plants,
coral reef productivity, and fish biomass.

In the 1970s, a few ecologists began examining
the shifts in ecosystems when animals
appeared or disappeared, but it was
not really until the past decade or so, with
the emergence of zoogeochemistry—which
explores how animals influence the flow of
key elements through living systems and the
physical environment—that we have seen an
increased focus on the role played by animals
in altering landscapes and seascapes (3, 4).

I visited Surtsey in 2021 while conducting
research for a book on how animals affect
the ecosystems in which they live. I had been
fascinated by the island since I was a kid and
realized that there was perhaps no better place
to explore the role of animals than a young
volcanic island. Heavy and persistent fog almost
forced the cancellation of our trip, but
in the end, the Icelandic Coast Guard saved
the day, delivering us to Surtsey by helicopter
to join the researchers already in place. It was
a rare opportunity to observe how animals
could help build an ecosystem almost from
scratch. As I write in the resulting book, Eat,
Poop, Die: “Animals are the beating heart of
the planet. In the same way that trees work
as the Earth’s lungs...animals pump nitrogen
and phosphorus from deep-sea gorges up to
mountain peaks and across hemispheres from
the poles to the tropics.”

On my last day on Surtsey, Borgthor
Magnússon, senior researcher at
the Icelandic Institute of Natural
History, told me that, soon, the
island would be the land of the
puffin. The seabirds will burrow
beneath the thick grasses, as
they do on neighboring steep-cliffed
islands. The fragile lava
will slough into the sea, leaving
behind a hard inner core
of palagonite, the basaltic glass
that formed when the lava,
still hot, flowed into seawater.
“Eventually, maybe in ten or
fifteen thousand years,” he said,
“Surtsey will probably be gone.”
He let that sink in. “But then we will have
another eruption and a new Surtsey.”

A photographer captured this image of a lagoon
on Surtsey soon after the island emerged.

Eat, Poop, Die: How Animals Make Our World
Joe Roman
Little, Brown Spark, 2023. 288 pp.

Bold ideas, daunting challenges


A vibrant history traces the triumphs and missteps 
of quantum, nuclear, and particle physics 


By Paul Halpern 


In our technological society, scientific
mechanisms have increasingly become
cloaked behind protective, user-friendly
interfaces—from the hidden heating and
cooling systems behind the walls of hermetically
sealed office buildings to the
unseen workings of smartphones behind 
their screens. What a joy, therefore, to pick 
up Grace in All Simplicity and encounter 
stunning true-life tales of passionate re- 
searchers grappling with the challenges of 
the natural world in all its rawness and dan- 
ger. Experimental physics, the 
book teaches us, has not been a 
task for the timid. 

Most science histories catalog 
the march of progress by means 
of success stories. In this book, 
Robert Cahn and Chris Quigg 
balance accounts of ground- 
breaking ventures with ample 
descriptions of heartbreaking 
failures, including tragic deaths 
of scientists. They begin their 
chronicle, for instance, with the 
harrowing tale of three Berlin 
physicists who tried, unsuccess- 
fully, to harness lightning strikes 
through a cable draped over 
mountains in the Italian-Swiss 
Alps for the purpose of energiz- 
ing and accelerating subatomic 
particles. One of the research- 
ers, Kurt Urban, was shocked re- 
peatedly and tragically fell from 
a high antenna to his instant 
death on a rockpile—the experiment ending 
in disaster. By vividly and poignantly portray- 
ing such setbacks, the authors help elucidate 
the lesson that scientific triumphs rest on 
scrap heaps of failed attempts. 

Despite such risks, experimentation is 
critically important for scientific progress. 
As the book emphasizes, well-constructed 
experiments with precise, repeated mea- 
surements are typically needed to verify 
or refute far-reaching theoretical predic- 
tions. Although theorists’ prognostications 
should be respected—especially those of
brilliant thinkers such as British quantum
physicist and Nobel laureate Paul
Dirac—sometimes, as the book points out, 
scientists misinterpret the consequences 
of their own theories until they are set 
straight by observation. 

The authors show how Dirac’s concise 
quantum equation for relativistic electrons 
ended up accurately predicting positrons— 
the antimatter counterparts of electrons— 
which Carl Anderson discovered in 1932. 
Initially, however, Dirac had wrongly characterized
that class of solutions to his equation—corresponding
to “holes” left behind
by electrons as they emerge from a negative
energy sea—as protons. That scheme
made little sense because protons are much
heavier than electrons.

Evocative anecdotes abound, featuring physicists such as Paul Dirac, pictured here.

In a wonderful personal anecdote, Quigg 
recalls conversing with Dirac in his later 
years at a cocktail party after a colloquium 
Quigg delivered at Florida State University 
and boldly asking him “what he could pos- 
sibly have been thinking.” Pressed to justify 
why he misidentified protons, at first, as 
the positive counterparts of electrons in his 
equation, Dirac shrugged it off by saying, 
“In those days, we were a little less ready to 
speculate about new particles.” 

Grace in All Simplicity
Robert N. Cahn and Chris Quigg
Pegasus, 2023. 400 pp.

It is such human touches, along with colorful
analogies and descriptions, that make
Grace in All Simplicity shine. The authors
paint vibrant portraits of so many of the prin- 
cipals involved in the histories of quantum, 
nuclear, and particle physics, along with re- 
lated fields, and describe their contributions 
with exceptional clarity. They are equally 
adept at recounting the drama of early- 
to-mid-20th-century physics, with its reli- 
ance on cosmic ray detectors and relatively 
compact accelerators, and at describing the 
triumphs of the late 20th and early 21st cen- 
turies. The latter—a substantial chunk of the 
book—chronicles the systematic verification 
of the standard model of particle physics by 
means of identifying particles, 
such as the Higgs boson and the 
carriers of the weak interaction, 
in the collision debris of mas- 
sive particle-smashers at CERN, 
Fermilab, and elsewhere. 

One cautionary note is that 
the book often jumps from 
one era to another, from the 
distant past to contemporary 
times and back again—swifter, 
it seems, than sparks leaping 
across frayed wires. For in- 
stance, the start of its seventh 
chapter, “Storytellers,” veers
from discussing the craftsman- 
ship of a retired late-20th-cen- 
tury Caltech physicist, back to 
Renaissance figures Leonardo 
da Vinci and Galileo Galilei, on- 
ward to the 1919 confirmation of 
general relativity in eclipse mea- 
surements, back to James Clerk 
Maxwell’s 19th-century work 
in electromagnetism, and onward again to 
Werner Heisenberg’s uncertainty principle 
proposed in the 1920s—all in the span of a 
few pages. Those lacking a solid knowledge 
of the history of physics might get confused 
by the frequent zigzagging between events 
that happened in various centuries. 

In short, Grace in All Simplicity offers an 
eloquent account of the history of the quest 
for the fundamental ingredients of nature, 
including a taste of the daunting challenges 
and an outstanding glimpse at the various 
personalities involved. However, following 
its circuitous narrative backward and for- 
ward in time makes one empathize with 
particles in a scattering experiment. 

10.1126/science.adl2396

Climate change puts
Amur leopard at risk 


The Amur leopard (Panthera pardus ori- 
entalis) is a leopard subspecies that has 
adapted to the East Asian environment 
near the Amur River, which runs through 
eastern Russia and along the northern 
border of China (1). Because of human
activities such as mining, logging, road 
building, population growth, and poaching 
(2), the Amur leopard population dropped 
substantially in the 1970s (3). In the years 
since, conservation efforts have increased 
the population to around 40 individuals
in China and 70 in Russia (2). However, the 
effects of climate change are exacerbating 
the threats that this Critically Endangered 
(2) species faces. 

Amur leopards live in coniferous- 
deciduous forests, where they rarely share 
territory and hunt alone for small animals 
(2). Although much of their habitat is now 
designated as national parks in southwest- 
ern Primorye Province of Russia and the 
neighboring Jilin and Heilongjiang prov- 
inces of China (4), rising temperatures and 
dry conditions driven by climate change 
have led to increased fires that—combined 
with human activities and landscape frag- 
mentation—have reduced the leopards’ 
territory and increased cub mortality (5, 
6). The spread of wildfires and increased 
temperatures have increased deforestation. 
Together with increased hunting of deer 
and wild boar by humans, this ecosystem 
degradation has led to reduced ungulate
prey and more competition between Amur
leopards and Amur tigers (Panthera tigris 
altaica) (2). As a result, the predators 
invade villages and prey upon livestock, 
increasing human-wildlife conflict and the 
likelihood that leopards will be killed or 
injured (2). 

Climate change has also increased the 
Amur leopard’s vulnerability to disease. 
Increased temperatures, humidity, and fre- 
quency and intensity of extreme weather 
events have reshaped interactions between 
species, facilitating the spread of viruses 
(7). In addition, the loss of habitat and 
stop-over migration areas has caused birds 
to alter migration patterns, leading to 
more contact between birds and mammals 
in East Asia (8). Cross-species transmis- 
sion of disease has increased by an esti- 
mated 4000-fold (9). Amur leopards have 
contracted the potentially lethal canine 
distemper virus (10) as well as the highly 
pathogenic avian influenza H5N1 virus (11). 

To protect Amur leopards from extinc- 
tion, future demographic models should 
include climate change-driven wildfires, 
interspecies competition, human conflicts, 
and infectious diseases when predict- 
ing population declines. Conservation 
responses, including vaccination programs, 
should take these factors into account (10).
In addition, a multi-pathogen surveillance 
system should be used to speed up the 
detection of highly virulent viral strains 
that infect felids (12). Together, these
efforts would mitigate the risk of Amur 
leopard extinction.

Mining threatens health
of Panama’s environment 


In October, Panama’s president, Laurentino 
Cortizo, signed a contract with Minera 
Panama, a subsidiary of First Quantum 
Minerals Ltd., which the legislature and 
the executive subsequently approved by
law (1, 2). The agreement gives Minera
Panama the right to use open-pit mining 
to extract copper and other minerals for 
at least 20 years from nearly 13,000 ha of 
land in the Donoso District, Province of 
Colon (2), a protected area in the heart
of the Mesoamerican Biological Corridor
on Panama’s Atlantic coast. The contract 
reversed previous laws (3) and disregarded 
citizen opinion and the recommendations 
of scientific institutions (4). Civil unrest 
and strikes have taken place to protest the 
decision. To protect its important ecosys- 
tems, Panama’s government must initiate 
a dialogue with protesters and commit to 
addressing their concerns. 

Open-pit mining requires the complete 
deforestation of the exploited area, making 
it one of the most carbon-intensive extrac- 
tive activities (5). The environmental deg- 
radation caused by deforestation related 
to open-pit mining is linked to declines in 
environmental health and the health of 
wildlife in surrounding forests and man- 
groves (6, 7). Open-pit mining also threat- 
ens public health by contaminating water 
sources and polluting air and soil with 
heavy metals (7, 8). Land use changes in 
mining areas lead to the loss of farmland, 
increasing food insecurity and poverty 
in local communities (8). Yet, Panama’s 
contract allows open-pit mining in an area 
protected by national and international 
environmental regulations (9). 

Environmental degradation in this 
region of Panama would undermine global 
efforts to mitigate climate change. Panama 
is one of only three carbon-negative coun- 
tries in the world (10), partly because more 
than 40% of its territory is covered by 
forests (11). However, the country has weak 
environmental institutions and lacks the 
budget for effective monitoring and man- 
agement of its protected areas (12). 

With increased mining in protected 
areas and limited regulations and enforce- 
ment, Panama is unlikely to remain carbon 
negative. To protect its vital ecosystems, 
the recent contract and related law must 
be repealed. Instead of allowing mineral 
extraction, Panama should formulate 
environmental policies based on scientific 
evidence, sufficiently consult with its resi- 
dents, and prioritize the preservation of its 
protected forests by investing in sustainable
development.

Legislative inertia fails
Brazil’s Cerrado 


The Cerrado, covering approximately 22% 
of the Brazilian territory, is an ecologically 
crucial biome (1). The region contains
rivers such as the Parana-Paraguay, 
Araguaia-Tocantins, and Sao Francisco, 
as well as the upper catchments of large 
Amazon tributaries, such as the Xingu and 
Tapajos (2). About 12,000 plant species and 
2500 animal species, among which 20% 
are exclusive to this habitat and roughly 
130 are endangered, live in the Cerrado 
(3). However, only 8.21% of the Cerrado 
is legally protected through Conservation 
Units (4), and human pressures, particu- 
larly agricultural expansion, have been 
responsible for the deforestation of 6 mil- 
lion hectares of native vegetation over the 
past decade (5). Brazil must pass legisla- 
tion to protect this vital region. 

Two legislative proposals that could 
mitigate deforestation in the Cerrado 
have been under discussion in the Federal 
Senate for more than 4 years. Bill No. 
1459/2019 (6) would amend Law No. 
12.651/2012 to increase the areas designated
for the protection of native vegetation by
35%. Bill No. 4203/2019 (7) would suspend 
deforestation authorizations in the region 
for a decade from the date of its approval. 
A third proposal, Bill No. 1600/2019 (8), 
which aimed to prioritize National Fund 
for the Environment resources to enhance 
and restore Cerrado biome environmental 
quality, was rejected in April 2023 (9). 

Brazil’s failure to pass these proposed 
laws impedes the implementation of effec- 
tive measures to mitigate deforestation in 
the Cerrado. Brazil’s representatives must 
commit to reversing this trend. Future 
legislative initiatives should consider the 
biome not only as a national heritage but 
as a pivotal component for environmental 
balance. Such initiatives could focus on 
passing the bills described, introducing 
new proposals with similar goals, or a combination
of both.

Preserving the Cerrado demands con- 
crete actions, including investments in 
research, implementation of sustainable 
practices, rigorous monitoring to curb 
deforestation, and an effective commit- 
ment within the legislative bodies. Brazil’s 
citizens, and especially its scientists, should 
hold their government accountable.

Translation
inhibition in 
viral defense 


Prokaryotic type III CRISPR-Cas systems are
complex and efficient barriers that
protect host cells against foreign
nucleic acids. To cope with viral
infections, these systems often combine
CRISPR RNA-guided viral RNA and DNA
cleavage with the synthesis of cyclic
oligoadenylate (cAₙ) signaling molecules,
which activate a diverse range of auxiliary
proteins that reinforce CRISPR-Cas defense.
Mogila et al. identified a cAₙ-dependent effector (named Cami1) that,
in response to type III CRISPR-Cas signaling, cleaves messenger RNA
in a translating bacterial ribosome, thereby blocking protein synthesis
and cell growth. Cami1 uses an active ribosome stalk-dependent capture
mechanism, which is similar to that of eukaryotic antiviral ribosome-inactivating
proteins. —DJ
Science, adj2107, this issue p. 1036 


Boron radicals go
asymmetric


A good catalyst must bind 
tightly enough to reactants to 
bias their reactivity but then 
loosen its grip sufficiently to 
release the products. Radical 
catalysts with unpaired elec- 
trons are rare, in large part 
because the release step is 
too unfavorable. Wang et al. 
now report that boron radicals, 
generated in situ from boranes 
bound to chiral carbene ligands, 
can achieve this balancing 
act as effective asymmetric 
catalysts for the cyclization of 
alkyne compounds, forming a 
variety of nitrogen heterocycles 
of interest in medicinal chemis- 
try research. —JSY 

Science, adg1322, this issue p. 1056

Resident regs

Regulatory T (Treg) cells in lymph
nodes play a key role in immune
tolerance, but whether they
can form long-lasting memory
is controversial. To study the
circulatory kinetics of these cells,
Kaminski et al. used a photoconvertible
mouse model that
enabled long-term tracking of
lymph node Treg cells in vivo. Most
of these cells were long-lived
memory-like cells that remained
in the lymph nodes for months
and accumulated with age. These
resident-memory Treg cells were
functionally heterogeneous
and had transcriptional profiles
similar to those of conventional
resident memory T cells.
Individual lymph nodes contained
Treg cell clones with a distinct
T cell receptor repertoire, and
clones were not shared between
lymph nodes. —HMI
Sci. Immunol. (2023)
10.1126/sciimmunol.adj5789

Fluorescence microscopy image of a lymph node stained to reveal different T cell types

An antiviral defense protein (pink) in bacteria blocks protein synthesis by binding to ribosomes (purple) and cleaving mRNA.


Two-faced receptor 

Ephrin type-A receptor 2 (EphA2) 
is a receptor for membrane-bound
ephrin ligands that
mediates signaling between
cells that contact one another.
EphA2 is unusual in that both 
the bound and unbound states 
are involved in signaling—the 
former produces tumor sup- 
pressive effects, and the latter 
promotes oncogenesis. Shi et al. 
observed structural properties of
EphA2 in live monkey kidney cells
by an advanced time-resolved
fluorescence spectroscopy
method called pulsed interleaved
excitation-fluorescence
cross-correlation spectroscopy 
(PIE-FCCS). This method enabled 
the measurement of receptor 
diffusion and oligomerization. 
The unbound receptors formed 
multimers that kept the recep- 
tor’s own kinase domains apart 
and promoted signaling through 
other associated kinases. Ligand 
binding promoted conformational 
changes into distinct clusters 
in which the intrinsic kinase 
domains engaged in transphos- 
phorylation of receptor tyrosine 
residues. —LBR 

Science, adg5314, this issue p. 1042 


CLIMATE SCIENCE 
Growing stronger 
every day 


The effect of increasing the
concentration of atmospheric
carbon dioxide (CO₂) on global
average surface air temperature
might be expected to be constant.
However, He et al. found that
this is not the case. Doubling the
atmospheric CO₂ concentration
increases the impact of any given
increase in CO₂ by about 25%,
owing to changes induced in the
climatological base state. The
more anthropogenic CO₂ emissions
raise the atmospheric CO₂
concentration, the more serious
the consequences will be. —HJS
Science, abq6872, this issue p. 1051 


METALLURGY 
Unexpected rotation 


Determining exactly how 
materials deform is key to better 
engineering and design. Using a 
three-dimensional microscopy 
technique, He et al. tracked 
crystal grains in polycrystalline
nickel through deformation
cycles. The authors found that 
these grains not only rotate in 
both directions but also some- 
times experience internal rotation 
of the lattice. This unexpected 
lattice rotation occurs more often 
in the larger grains and is due to a
phenomenon called back stress,
which drives dislocation slip
that generates the rotation. This
discovery represents an impor- 
tant and distinct mechanism for 
accommodating deformation 
and could lead to better material 
design. —BG 

Science, adj2522, this issue p. 1065 


HUBBARD MODEL 
A tough law to break 


Close to absolute zero, the ratio 
of thermal and charge conduc- 
tivities of many materials has 
been found to be proportional to 
the temperature. This behavior, 
referred to as the Wiedemann- 
Franz law, is expected if the same 
quasiparticles are the carriers of 
both heat and charge transport. 
In strongly correlated materials, 
however, quasiparticles may no 
longer be well defined. Wang et 
al. undertook a comprehensive 
numerical study of heat and 
charge transport within the 
doped Hubbard model, which 
incorporates strong interactions. 
Surprisingly, they found that as 
they lowered the temperature as 
far as their numerical methods 
allowed, the system approached 
the Wiedemann-Franz law limit. 
—JS 

Science, ade3232, this issue p. 1070 


IMMUNOLOGY 
Perturbing T cell actin 
with PD-1 


When engaged by its ligands, 
the receptor programmed cell 
death-1 (PD-1) inhibits the activa- 
tion of T cells stimulated through 
the T cell receptor. Alleviating this 
effect is the basis of checkpoint 
inhibitors that promote antitumor 
immunity. Paillon et al. show that
PD-1 signaling inhibits the forma- 
tion of an immunological synapse 
between T cells and their target 
cells by preventing the remodel- 
ing of the actin cytoskeleton in 
a manner independent of its 
tyrosine-based signaling motifs 
(see the Focus by Acharya and 
Kumari). These findings suggest 
an alternative mechanism by 
which PD-1 regulates immune 
responses. —JFF 
Sci. Signal. (2023) 
10.1126/scisignal.adh2456, 
10.1126/scisignal.adl3956

BEHAVIORAL ADAPTATION
Snowy insect barrier 


Cold-adapted species can benefit from snow that
persists across seasons. As human-induced climate
change increases temperatures, such persistent snow
patches are being reduced, with unknown impacts on
these species. Hayes and Berger used visual observations
of animals in the field to test whether mountain goats
(Oreamnos americanus) in two sites in North America’s Rocky 
Mountains displayed reduced respiration rates, a sign of 
reduced thermal stress, when on snow patches. They found 
no evidence for respiratory rate differences, perhaps owing 
to the countering effect of albedo on the cool snow; however, 
animals experienced reduced insect harassment, as mea- 
sured by ear flicks, when lying on snow. —SNV 


PNAS Nexus (2023) 10.1093/pnasnexus/pgad339 


Mountain goats rest on snow patches to avoid the attention
of biting insects.


PARENTAL CARE 
Care with 
a brood chamber 


Bryozoa, or moss animals, 
are a phylum of filter-feeding 
aquatic invertebrates that live in
sedentary colonies. Interestingly,
several species raise their young
in a brood chamber. Grant et al.
investigated the evolutionary 
history of this form of parental 
care in an order of bryozoans 
called Cheilostomatida, which 
are known to have distinct types 
of incubation chambers. The 
phylogenetic tree that they 
constructed using genomic

HUMAN GENETICS
Hotspots are hotbeds of 
mutation 


Meiotic recombination allows 
for new combinations of 
alleles to arise in populations 
and, thus, for selection to act 
more effectively. However, this 
process is also known to be 
inherently mutagenic, causing 
high turnover of recombina- 
tion hotspots during evolution. 
Hinch et al. assembled variant 
data to create high-resolution 
mutational profiles near recom- 
bination hotspots in humans 
(see the Perspective by Baudat 
and de Massy). They found that 
these mutation profiles fit with 
known biases of DNA repair 
mechanisms, happen at higher 
frequency than previously 
thought, and are consistent with 
the higher burden of mutation 
in the paternal germline. This 
study offers a detailed look 
at mutational processes and 
enumerates the burden of de 
novo mutations due to meiotic 
recombination. —CNS 

Science, adh2531, this issue p. 1012; 

see also adl2021, p. 997 


MAIZE GENETICS 
Admixture and the 
origins of maize 


Domestication of plants and 
animals is often characterized 
by selection for specific traits 
interspersed with introduction 
of new desirable traits from wild 
relatives. Yang et al. examined 
genetic data from more than 
1000 varieties of maize and 
related species to clarify the 
complex origins of this agricul- 
tural staple. They found evidence 
that after initial domestication, 
introgression from a relative of 
domesticated maize, Zea mays 
ssp. mexicana, occurred in the 
highlands of Mexico before 
propagating across Central 
America. Alleles from this wild 
relative affect photoperiodicity 
and flowering time, which suggests
that traits from Zea mays
ssp. mexicana may have been
beneficial during domestication. 
These results demonstrate the 
importance of broad sampling in 
elucidating the history of domes- 
ticates. —CNS 

Science, adg8940, this issue p. 1013 


ENZYMOLOGY 
Watching DNA 
photolyases at work 


Ultraviolet radiation can gener- 
ate lesions in DNA in which 
two adjacent bases become 
covalently linked as a 
cyclobutane pyrimidine dimer. However, 
the enzyme that fixes these 
lesions also takes advantage of 
the energy in light to initiate the 
repair reaction. Two independent 
groups, Maestre-Reyna et al. 
and Christou et al., used time- 
resolved crystallography at an 
x-ray free electron laser to cap- 
ture the process of DNA repair 
from picoseconds to microsec- 
onds (see the Perspective by 
Vos). A bent flavin conformation 
appears very early after the 
excitation pulse, and the enzyme 
constrains the cofactor confor- 
mation to favor electron transfer 
to the lesion rather than deex- 
citation. The cyclobutane dimer 
is cleaved one bond at a time 
with a transient intermediate 
predominating at 1 nanosecond. 
After cleavage, the separated 
bases occupy more space in 
the active site than in the lesion, 
which facilitates dissociation 
from the enzyme. —MAF 
Science, add7795, adj4270, this issue 
p. 1014, 1015; 
see also adl3002, p. 996 


SLEEP 
Ten thousand winks 


Anyone who has driven late 

at night will likely recognize 
microsleeps, those seconds- 
long “sleep” bouts that pop, 
unwelcome, into our otherwise 
wakeful attention. In some 
contexts, such interruptions of 
wakefulness are dangerous, but 
if they provide cumulative sleep 
benefits, they could be useful in 
animals that otherwise need to 
be continually vigilant. Libourel 
et al. tested this hypothesis in 
breeding chinstrap penguins 
(Pygoscelis antarcticus) using 
remote electroencephalogram 
monitoring (see the Perspective 
by Harding and Vyazovskiy). 
The penguins nodded off more 
than 10,000 times a day, for only 
around 4 seconds at a time, but 
still managed to accumulate 
close to 11 hours of sleep. Their 
breeding success suggests that 
this strategy allows them to get 
the sleep they need. —SNV 
Science, adh0771, this issue p. 1026; 
see also adl2398, p. 994 
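
The figures quoted in this summary can be cross-checked with simple arithmetic. The sketch below is only an illustration: it takes the rounded values from the text (10,000 bouts per day, about 4 seconds each) as inputs, and the variable names are ours rather than anything from the study.

```python
# Back-of-the-envelope check of the rounded figures quoted in the summary.
bouts_per_day = 10_000   # "more than 10,000 times a day"
seconds_per_bout = 4     # "around 4 seconds at a time"

total_seconds = bouts_per_day * seconds_per_bout
total_hours = total_seconds / 3600  # seconds -> hours

print(f"{total_hours:.1f} hours of accumulated sleep per day")
```

The product comes to roughly 11 hours, in line with the "close to 11 hours" reported.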


CALORICS 
A polarizing sacrifice 


Electrocaloric materials can 
pump heat in response to a 
changing electric field, which 
makes them useful in solid-state 
cooling applications. Zheng et 
al. discovered that a very large 
electrocaloric effect emerges 
in a terpolymer when pores are 
introduced with a sacrificial 
organic crystal with a low boil- 
ing temperature. The polymer 
interface around the pores has 
a large fraction of polarizable 
material, which gives rise to the 
large electrocaloric effect. The 
authors show that this porous 
material is stable after cycling it 
through an electric field 3 million 
times. —BG 

Science, adi7812, this issue p. 1020 
EXOPLANETS 
Small star, big exoplanet 


Planets form in protoplanetary 
disks of gas and dust around 
young stars that are undergoing 
their own formation process. 
The amount of material in the 
disk determines how big the 
planets can grow. Stefansson 

et al. observed a nearby low- 
mass star using near-infrared 
spectroscopy. They detected 
Doppler shifts due to an orbiting 
exoplanet of at least 13 Earth 
masses, which is almost the 


mass of Neptune. Theoretical 
models do not predict the 
formation of such a massive 
planet around a low-mass star 
(see the Perspective by Masset). 
The authors used simulations to 
show that its presence could be 
explained if the protoplanetary 
disk were 10 times more massive 
than expected for the host star. 
—KTS 

Science, abo0233, this issue p. 1031; 

see also adl3365, p. 999 


IMMUNOLOGY 
Selectively ablating skin 
Trm cell subsets 


Tissue-resident memory T (Trm) 
cells are long-lasting memory T 
cells that occupy various tissue 
niches and play assorted specialized 
functions. Two important 
Trm cell subsets in human skin 
are interferon-γ-producing 
CD8+ Trm1 cells, which have 
antiviral and anticancer roles, 
and interleukin-17-secreting 
CD8+ Trm17 cells, which participate 
in antibacterial immunity 
and wound-healing responses. 
However, both Trm cell subtypes 
can also contribute to skin 
pathologies. Studying mice, Park 
et al. established that skin Trm1 
cells require a different signaling 
pathway for their differentiation 
relative to Trm17 cells. Trm17 cells 
could be selectively ablated by 
targeting elements of the signaling 
axis that is involved in Trm17 
development. —STS 


Science, adi8885, this issue p. 1073 


CORONAVIRUS 

Calling on cellular 
immunity 

Protection against severe acute 
respiratory syndrome coronavi- 
rus 2 (SARS-CoV-2) is primarily 
thought to be mediated by 

B cells and antibodies. However, 
T cell responses are also acti- 
vated by SARS-CoV-2 infection 
and vaccination and may be 
especially important in protect- 
ing individuals who lack B cells. 
Zonozi et al. characterized the 
T cell response to SARS-CoV-2 
vaccination and infection in indi- 
viduals who lack B cells because 
of treatment with rituximab or 
a primary immunodeficiency. 
Not only were virus-specific 
CD4+ and CD8+ T cells elicited 
by both infection and vaccina- 
tion, but the T cells exhibited 
greater reactivity and prolifera- 
tive capacity as compared with 
controls. The authors evaluated 
clinical outcomes of SARS-CoV-2 
infection in B cell-deficient 
individuals and observed that 
vaccination still reduced the risk 
of severe disease. These results 
demonstrate a protective role 
for SARS-CoV-2-specific T cells 
and highlight the importance of 
SARS-CoV-2 vaccination in B 
cell-deficient patient populations. 
—CM 
Sci. Transl. Med. (2023) 
10.1126/scitranslmed.adh4529 
sequences shows that different 
types of brood chambers likely 
evolved at least 10 times from 
ancestors who might have been 
free spawning. It seems that the 
selection pressure experienced 
by this group of animals favors 
more parental care rather than 
less. —DJ 

Proc. R. Soc. Lond. Ser. B (2023) 

10.1098/rspb.2023.1458 


HOST DEFENSE 


Immune checkmate 
Virulence factors produced by 
Candida albicans contribute to 
its pathogenicity and ability to 
infect the brain but inadvertently 
may facilitate the detection of 
the fungus by brain-resident 
immune cells. Wu et al. explored 
mechanisms by which aspartic 
proteinases, called Saps, and the 
cytolytic toxin candidalysin may 
activate microglia. In vitro, Saps 
cleaved amyloid precursor pro- 
tein (APP), a molecule expressed 
by neurons, to generate peptides 
that activated antifungal activ- 
ity through Toll-like receptor 4 
(TLR4). In addition, the authors 
found that candidalysin could 
bind to the integrin CD11b 
and that loss of either protein 
impaired the clearance of 
C. albicans from the brains of 
infected mice. —SHR 
Cell Rep. (2023) 
10.1016/j.celrep.2023.113240 


PHYTOPLANKTON 
Measuring the right stuff 


Marine phytoplankton produce 
dimethyl sulfide (DMS), which 
is oxidized to methanesulfonic 
acid (MSA) and sulfate in the 
atmosphere. Because of this, 
the MSA content of ice cores 
from Greenland predictably has 
been used to infer the abun- 
dance of phytoplankton in the 
North Atlantic. Jongebloed et 
al. determined the concentra- 
tion of sulfate derived from 
DMS and compared it to that of 
DMS-derived MSA in Greenland 
ice and show that although the 
sum of MSA and DMS-derived 
sulfate is relatively steady over 
the Industrial Era, their propor- 
tions have changed because of 
the impact of anthropogenic 
pollution on atmospheric chemistry. 
Thus, atmospheric chemistry 
must be considered when using 
MSA as a proxy for phytoplankton 
abundance. —HJS 
Proc. Natl. Acad. Sci. U.S.A. (2023) 
10.1073/pnas.2307587120 


CORAL REEFS 


Lab-made heat tolerance 
Coral reefs provide habitat for 
many species but are threatened 
by global changes, including 
warming seas. The algae in 
corals, which provide them 

with essential nutrients, vary in 
their physiological tolerances 
and have a strong influence 
over the response of corals to 
heat stress. Chan et al. show 
that microalgal symbionts 

that evolved under increased 
temperatures in the laboratory 
successfully colonized some 
Galaxea fascicularis corals and 
provided them with greater heat 
tolerance. The laboratory-bred 
microalgae (Cladocopium pro- 
liferum) performed similarly to 
a naturally heat-tolerant species 
(Durusdinium sp.) but provided 
corals with faster growth and 


recovery after bleaching and 
showed evidence of persistence 
over multiple years. —BEL 
Glob. Chang. Biol. (2023) 
10.1111/gcb.16987 


MESOSCOPIC PHYSICS 


A double edge 


Quasiparticles that reside 

in two-dimensional solids 

can display exotic quantum 
statistics—they do not behave 
as either fermions or bosons. 
Additionally, they carry a charge 
that is a fraction of an electron’s 
charge. Recently, these frac- 
tional quantum statistics were 
seen experimentally in interfero- 
metric and quasiparticle collider 
measurements in fractional 
quantum Hall (FQH) systems. 
Such experiments become more 
complex if more than one FQH 
edge state is present. Nakamura 
et al. studied a state formed at 
filling factor 2/5, which is pre- 
dicted to have two edge states. 
The researchers’ conductance 
measurements in a Fabry-Pérot 
interferometer confirmed that 
the outer edge of this state is 
similar to that of the simpler 1/3 
filling state; the inner edge state 
was found to have a fifth of the 
electron’s charge and to obey 
fractional statistics of the kind 
predicted by theory. —JS 
Phys. Rev. X (2023) 
10.1103/PhysRevX.13.041012 


CHEMICAL SAMPLING 
Under the sea 


Marine organisms, both large 
and microscopic, are important 
sources of complex natural 
products and metabolites, many 
of which have biological activity. 
Collecting seawater in situ to 
recover these molecules is a 
daunting analytical challenge. 
Mauduit et al. developed an 
underwater sampling device 
that can be installed adjacent to 
corals and sponges to capture 
released soluble compounds by 
solid-phase extraction. Combined 
with untargeted high-resolution 
mass spectrometry, this 
approach was able to capture 
many diverse natural products in 
seawater that can be associated 
with specific species. -MAF 
ACS Cent. Sci. (2023) 
10.1021/acscentsci.3c00661 


plugs can be colonized 
with algal symbionts 
that tolerate — 
ENZYMOLOGY 


Visualizing the DNA repair process by a photolyase at atomic resolution 


Manuel Maestre-Reyna*, Po-Hsun Wang, Eriko Nango, Yuhei Hosokawa, Martin Saft, Antonia Furrer, 
Cheng-Han Yang, Eka Putra Gusti Ngurah Putu, Wen-Jin Wu, Hans-Joachim Emmerich, Nicolas Caramello, 
Sophie Franz-Badur, Chao Yang, Sylvain Engilberge, Maximilian Wranik, Hannah Louise Glover, 
Tobias Weinert, Hsiang-Yi Wu, Cheng-Chung Lee, Wei-Cheng Huang, Kai-Fa Huang, Yao-Kai Chang, 
Jiahn-Haur Liao, Jui-Hung Weng, Wael Gad, Chiung-Wen Chang, Allan H. Pang, Kai-Chun Yang, 
Wei-Ting Lin, Yu-Chen Chang, Dardan Gashi, Emma Beale, Dmitry Ozerov, Karol Nass, Gregor Knopp, 
Philip J. M. Johnson, Claudio Cirelli, Chris Milne, Camila Bacellar, Michihiro Sugahara, Shigeki Owada, 
Yasumasa Joti, Ayumi Yamashita, Rie Tanaka, Tomoyuki Tanaka, Fangjia Luo, Kensuke Tono, 

Wiktoria Zarzycka, Pavel Müller, Maisa Alkheder Alahmad, Filipp Bezold, Valerie Fuchs, Petra Gnau, 
Stephan Kiontke, Lukas Korf, Viktoria Reithofer, Christian Joshua Rosner, Elisa Marie Seiler, 
Mohamed Watad, Laura Werel, Roberta Spadaccini, Junpei Yamamoto, So Iwata, Dongping Zhong, 
Jörg Standfuss, Antoine Royant, Yoshitaka Bessho*, Lars-Oliver Essen*, Ming-Daw Tsai* 


INTRODUCTION: Dimerization of thymine bases 
to form a cyclobutane-pyrimidine dimer (CPD) is 
a common ultraviolet light-induced DNA lesion. 
Photolyases catalyze light-triggered repair of CPD- 
DNA, thus contributing to genome stability in 
many organisms. Combining time-resolved crys- 
tallography and computational analyses, we report 
an atomic visualization of the photolyase-catalyzed 
DNA repair process. We captured electron transfer 
at low picoseconds, chemical steps at picoseconds 
to nanoseconds, active-site recovery at nanosec- 
onds to microseconds, and reannealing of the 
double-stranded DNA (dsDNA) at submillisec- 
onds, forging new ground in DNA repair, struc- 
tural biology, and enzymology. 


RATIONALE: Mechanistic models from previous 
spectroscopic studies set a framework for using 
time-resolved serial femtosecond crystallog- 
raphy (TR-SFX) to visualize this catalytic pro- 
cess. This technique provides a concise view of 
not only the repair chemistry but also hitherto 
unknown postrepair events. 


RESULTS: Two series of TR-SFX experiments 
were performed, one from picoseconds to nano- 
seconds and the other from nanoseconds to 
microseconds (see the figure). Our visualization 
of the repair of a CPD begins at 100 ps, with 
Arg256 (R256) becoming dynamic and moving 
to stabilize the CPD, which suggests the initiation 
of the forward electron transfer from the 
reduced, anionic hydroquinone state of flavin 
adenine dinucleotide (FADH−) to the CPD. At 
650 ps, the C5-C5’ bond of the CPD is predom- 
inantly split, and at 1 ns, the C6-C6’ is likewise 
split. Recovery of R256, a five-water cluster 
(5WC), and the FADH− coenzyme occurs during 
the next 500 ns, returning to their respective 
resting-state conformations. The 
repaired thymine bases remain in the active 
site during this time and then start to return 
to reanneal with the dsDNA in the microsecond 
range. The 200-μs structures show the 
coexistence of a back-flipping intermediate 
and the reannealed product before its final 
release from the enzyme. 


CONCLUSION: Our results uncover the atomic 
mechanism of how DNA photolyases repair 
DNA in real time. These data reveal an 
ordered breaking of the covalent C-C bonds 
and opening of the cyclobutane ring, as well 
as the concomitant conformational changes 
of the photolyase and its FAD coenzyme. De- 
fined intermediates were also captured when 
the enzyme-product complex was recovering 
and repaired product bases were departing 
from the active site to pair with their com- 
plementary bases in the dsDNA. 


The list of author affiliations is available in the full article online. 
*Corresponding author. Email: mmaestre@ntu.edu.tw (M.M.-R.); 
bessho@spring8.or.jp (Y.B.); essen@chemie.uni-marburg.de 
(L.-O.E.); mdtsai@gate.sinica.edu.tw (M.-D.T.) 

Cite this article as M. Maestre-Reyna et al., Science 382, 
eadd7795 (2023). DOI: 10.1126/science.add7795 
Elucidation of the main processes and key intermediates in the DNA repair reaction catalyzed by photolyase. Selected intermediates (cyan) of the repair process 
are overlaid with the structure of the dark state (gray) to illustrate structural changes during catalysis. T7 and T8 are the thymine-7 and thymine-8, the damaged 5'- 
and 3'-thymines of the CPD lesion in the DNA strand. TT. refers to the two thymines together. 
Photolyases, a ubiquitous class of flavoproteins, use blue light to repair DNA photolesions. In this work, we 
determined the structural mechanism of the photolyase-catalyzed repair of a cyclobutane pyrimidine dimer 
(CPD) lesion using time-resolved serial femtosecond crystallography (TR-SFX). We obtained 18 snapshots 
that show time-dependent changes in four reaction loci. We used these results to create a movie that 
depicts the repair of CPD lesions in the picosecond-to-nanosecond range, followed by the recovery of the 
enzymatic moieties involved in catalysis, completing the formation of the fully reduced enzyme-product 
complex at 500 nanoseconds. Finally, back-flip intermediates of the thymine bases to reanneal the DNA were 
captured at 25 to 200 microseconds. Our data cover the complete molecular mechanism of a photolyase and, 
importantly, its chemistry and enzymatic catalysis at work across a wide timescale and at atomic resolution. 


Photolyases catalyze the repair of damaged 

DNA that contains ultraviolet (UV) light- 

induced lesions, including cyclobutane 

pyrimidine dimers (CPDs) and the 6-4 

photoproduct (6-4PP) (1, 2). CPD lesions 
are the most abundant type of DNA damage 
that occurs in nature upon UV-B exposure and 
are a major cause of skin cancer in humans (3). 
Catalysis by CPD photolyases involves multi- 
ple redox reactions and a multistep splitting 
of the cyclobutane ring (4). The enzyme is first 
activated through photoreduction of its coen- 
zyme flavin adenine dinucleotide (FAD) from the 
oxidized (FADox) to the reduced state (FADH−) 
(5). This process requires two single-electron 
transfer steps mediated by an electron transfer 
chain that involves three tryptophan residues, 
yielding, respectively, the radical semiquinone 


intermediate (FAD•− or its protonated form 
FADH•) and the reduced, anionic hydroquinone 
state (FADH−) (6). Only the latter is capable of 
catalyzing light-induced DNA repair. The mech- 
anisms of photoreduction and DNA repair have 
been elucidated by extensive spectroscopic and 
theoretical studies over the past three decades 
(1, 5, 7-11). However, relatively limited information 
from high-resolution crystal structures is 
available, and, apart from the apo form (12-14), 
all such structures were obtained using synchro- 
tron x-ray radiation that induced both photo- 
reduction (15-17) and DNA repair reactions (18). 

Recently, we used time-resolved serial femto- 
second crystallography (TR-SFX) to elucidate 
the structural mechanism of the photoreduc- 
tion processes in the Methanosarcina mazei 
class II CPD photolyase (MmCPDII) (12). We 


identified an Asn/Arg-Asp redox sensor triad 
that regulates FAD rehybridization and proto- 
nation and observed buckling and twisting of 
the isoalloxazine ring of the coenzyme FAD, 
which occurred in the submicrosecond regime 
after the light-triggered electron transfer step. 
The TR-SFX study of the photoreduction pro- 
cess set the stage for us to study the main func- 
tion of the MmCPDII photolyase—the repair 
of CPD lesions, which we present here. 

On the basis of extensive spectroscopic and 
computational analyses (1, 2, 4, 5, 19, 20), the 
repair of the cyclobutane-bridged TT dimer 
(T<>T, 1 in Fig. 1A) by photolyases was sug- 
gested to go through three intermediates. First, 
a forward electron transfer (FET) from the reduced 
coenzyme FADH− produces (T<>T)•− (2), 
where the excess electron may be delocalized 
between the two pyrimidines. The resulting 
CPD radical anion then undergoes cleavage of 
its C5-C5’ bond to produce (T_T)•− (3) and further 
cleavage of the C6-C6’ bond to give (T+T)•− 
(4). The latter then transfers an electron back 
to the flavin coenzyme to produce the repaired 
thymine bases (T+T, the product, designated 
as 5 in Fig. 1A). Although such a repair mech- 
anism is chemically feasible and well supported 
by spectroscopic data, structures of its reaction 
intermediates await elucidation in the enzyme- 
bound form. Furthermore, given the moderate 
quantum yields for DNA repair, at least in class II 
photolyases, the underlying interactions between 
reaction intermediates, the isoalloxazine ring 
of the coenzyme, and active-site residues of the 
photolyase remain to be revealed at high reso- 
lution. The first goal of this TR-SFX study was 
hence to identify and characterize five reaction 
intermediates that correspond to the chemical in- 
termediates shown in Fig. 1A in the DNA-enzyme 
complex, Int1 to Int5. Their assignment was based 
on observed changes of the coenzyme’s geom- 
etry, conformations of active-site residues, and 
the chemical state of the CPD during its repair. 

The chemical reactions and electron trans- 
fers shown in Fig. 1A usually occur at the ps-ns 
timescale, but we are interested in the com- 
plete catalytic cycle of the enzyme, which addi- 
tionally involves slower conformational steps 
in the ns-us range. When complexed to photo- 
lyases, the TT dimer in UV-damaged DNA, which 
retains base pairing with adenine bases in 
the unbound state, though somewhat twisted 
(21), flips out of double-stranded DNA (dsDNA) 
Fig. 1. Proposed intermediates during repair of T<>T lesions as catalyzed by DNA photolyases. 

(A) Canonical scheme of DNA repair by CPD photolyases. 3'-T, 3'-thymine; 5'-T, 5' thymine. (B) The CPD 
lesion-containing dsDNA used in this work, which contains the cis-syn CPD with its phosphodiester linkage 
(22). (C) Illustration of the ρC and ρN dihedral angles of the FADH− isoalloxazine moiety. As shown by 
TR-SFX of photoreduction, the isoalloxazine moiety of FAD in MmCPDII undergoes symmetrical buckling or 
bending (ρC = ρN) or asymmetrical twisting (ρC ≠ ρN) upon redox change (12). Most importantly, the 


transition from one geometry to another is orders of 


to interact with the FAD coenzyme and active- 
site residues. This double flip causes a strong 
kink in the canonical B-DNA structure and 
creates an unpaired bubble that is stabilized 
by the photolyase’s bubble-intruding region 
(BIR) (22). Accordingly, a second goal of this 
TR-SFX study was to identify the conformational 
intermediates (Con) in the inverse conforma- 
tional changes that are required for dsDNA re- 
annealing and dissolution of the DNA-photolyase 
complex after completion of the chemical CPD 
repair. This analysis may be particularly reward- 
ing given that very little dynamic information 
is available for the late part of the reaction cy- 
cle and that photolyases bind already repaired 
dsDNA >4 orders of magnitude more weakly 
than they bind UV-damaged dsDNA (23). 


Results 
Tailoring TR-SFX to uncover the 
photolyase mechanism 


Our DNA repair study depended on cocrystals 
of fully reduced MmCPDII and CPD-containing 
DNA (Fig. 1B) (22), which were grown within 
12 hours before use directly at x-ray free elec- 
magnitude slower than electron transfer itself. 


tron laser (XFEL) sites using anaerobic tents 
and safety-light conditions. Two sets of TR-SFX 
data were collected: The first covers the CPD 
repair reaction itself from 100 ps to 10 ns (the 
ps-ns series), as performed at SwissFEL, where 
crystals were excited by a 0.98-ps-long, 10-μJ, 
400-nm pump laser pulse with a 50-μm diameter 
focal spot. For the second time series, 
which covers relaxation of the photolyase-product 
complex and DNA release [10 ns to 200 μs, 
the ns-μs series, SACLA (SPring-8 Angstrom 
Compact free-electron Laser)], samples were 
excited by a 3-ns-long, 150-μJ, 408-nm pump 
laser pulse with a 100-μm focal spot diameter. 
Although illumination at ~400 nm likely generates 
a mixture of S1 and S2 excited FADH−* 
species, the S2→S1 decay has been noted to 
occur within a few ps, considerably faster than 
FET (24). 

For photolyase-mediated DNA repair, TR- 
SFX conditions were optimized for achieving 
anticipated DNA repair yields, as detailed in 
supplementary text S1 and figs. S1 and S2. A 
brief summary is provided here: (i) We confirmed 


that the oxidized form of MmCPDII was unable 
to initiate FET toward CPD and failed to cause 
any effects, including repair of the CPD under 
very high laser powers, both under ps-ns and 
ns-μs conditions (fig. S1, A and B). (ii) Before performing 
TR-SFX experiments that addressed 
CPD repair itself, we validated that difference 
density signatures of the 10-ns TR-SFX structure 
(fig. S1E) coincided with those initially reported 
from x-ray radiation-triggered repair (fig. S1C) 
(17, 18) and low-dosage light-emitting diode (LED)- 
induced repair (fig. S1D). (iii) The reduction of 
laser power from 10 to 5 μJ in the ps-ns series 
caused diminished signature signals (fig. S2B). 
(iv) Control experiments for light contamination 
assured clean dark datasets (fig. S2, A and B). 
In TR-SFX experiments, the power and ener- 
gies of laser excitation pulses generally exceed 
one absorbed photon per chromophore, thus 
implying potential multiphoton effects in light- 
triggered systems, which may affect the photochemical 
reaction course and dynamics (25-30). 
Additionally, multiphoton absorption can result 
in sample heating due to chromophore relaxa- 
tion from high-energy states. We found that not 
more than ~68% of incident light is accessible 
for microcrystals embedded in grease because 
of light scattering (supplementary text S2 and 
fig. S3). Accordingly, calculation of photon dos- 
ages as absorbed by the enzyme must take this 
and other factors such as the quantum yields into 
account. Nevertheless, our results indicated that 
up to 10 and 30 photons are potentially absorbed 
per FAD chromophore per pulse for the ps-ns 
and ns-μs series, respectively (supplementary 
text S3), numbers that are high enough to cause 
activation of the FADH− coenzyme to levels 
higher than the S1 state. However, electronic 
relaxation of FADH−* to S1 from higher states 
than S2 is expected to proceed even faster than 
the S2→S1 transition by internal conversion (24), 
that is, in the sub-ps range. Given the time 
resolution of our TR-SFX experiments (Table 1), 
multiply excited FADH−* species should have 
decayed before the first catalytic event, that is, 
electron-transfer to the CPD lesion. 
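
To give a sense of scale for the pump conditions quoted above, the following minimal sketch converts a pulse energy and wavelength into a photon count and an average photon fluence at the focal spot. It is an illustration under stated assumptions, not the calculation from supplementary text S3: it assumes a uniform (flat-top) illumination of a circular spot, applies the ~68% accessible fraction as a simple multiplier, and uses function and parameter names of our own choosing. It does not attempt to reproduce the per-chromophore figures of 10 and 30 photons, which also depend on absorption properties not given here.

```python
from math import pi

PLANCK_J_S = 6.62607015e-34       # Planck constant (J*s)
LIGHT_SPEED_M_S = 2.99792458e8    # speed of light in vacuum (m/s)


def photons_per_pulse(pulse_energy_j: float, wavelength_m: float) -> float:
    """Number of photons in a pulse: pulse energy divided by photon energy h*c/lambda."""
    photon_energy_j = PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m
    return pulse_energy_j / photon_energy_j


def accessible_fluence_per_um2(pulse_energy_j: float, wavelength_m: float,
                               spot_diameter_um: float,
                               accessible_fraction: float = 0.68) -> float:
    """Average accessible photons per square micrometer.

    Assumes a flat-top beam over a circular spot (a simplification) and scales
    by the fraction of incident light that reaches the crystals (~68% in the text).
    """
    spot_area_um2 = pi * (spot_diameter_um / 2.0) ** 2
    return accessible_fraction * photons_per_pulse(pulse_energy_j, wavelength_m) / spot_area_um2


# Pump parameters as quoted in the text for the two time series.
ps_ns_photons = photons_per_pulse(10e-6, 400e-9)    # 10 uJ at 400 nm
ns_us_photons = photons_per_pulse(150e-6, 408e-9)   # 150 uJ at 408 nm

print(f"ps-ns series: {ps_ns_photons:.2e} photons per pulse")
print(f"ns-us series: {ns_us_photons:.2e} photons per pulse")
print(f"ps-ns fluence: {accessible_fluence_per_um2(10e-6, 400e-9, 50):.2e} photons/um^2")
print(f"ns-us fluence: {accessible_fluence_per_um2(150e-6, 408e-9, 100):.2e} photons/um^2")
```

Although the nanosecond pulse carries 15 times more energy, its focal spot has four times the area, so the average fluence differs by a much smaller factor than the pulse energies do.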


Analyzing TR-SFX data for 
photolyase-mediated CPD-DNA repair 


Given the complex set of TR-SFX data, a brief 
explanation of the methods we used for struc- 
tural analysis and the nomenclature is given 
first. For structure factor extrapolation and 
refinement, we followed previously described 
guidelines (31-34). For both time series, the 
basic conditions and main structural features 
are summarized in Table 1, whereas the de- 
tailed structural parameters are listed in tables 
S1 and S2. Each structure is given a name in 
the first column of Table 1, for example, F10ns 
and N10ns for the 10-ns snapshots, where the 
one-letter code indicates the type of pumping 
laser (F for femtosecond laser pump in the ps-ns 
series and N for nanosecond pump in the ns-μs 
series), followed by the delay time after 
Table 1. Properties of the static and time-resolved structures of MmCPDII complexes with damaged DNA. A “-" indicates that no value was assigned. 
N/A, not applicable; n.d., not determined; PDB, Protein Data Bank. 
*The definitions of structure names and intermediate numbers are explained in the text. Each structure has a uniqu 
(7YEM) because they correspond in N200μs to the two independent complexes, A and B, of the asymmetric unit. When multiple intermediates coexist in a structure, the bolded number designates 
the major form. F250ps may also contain a contribution by Int3 based on negative density at C5-C5' only. †The TT state defines the linkage between the 5'-thymine and the 3'-thymine as the 
following: T<>T, linked by cyclobutane; T_T, linked by the C6-C6' bond; T+T, not linked, but both thymines are in the active site; T/T, not linked, and only the 3'-thymine is in the active site; -T-T-, 
‡Here, no dominant CPD, TT, and FAD intermediates can be defined. Hence, these structures should be 
§These are the conformations of active-site moieties 5WC and the R256 side chain. 5WC is designated as ordered (intact) or dynamic. R256 is designated as pointing 
toward 5WC or CPD. N/A is noted for 200 μs because the thymine bases have moved out of the active site. The BIR composed of amino acids D428, W431, and R441 acts as a lock. In its 
locked state, BIR stabilizes the unpaired bubble and prevents CPD flip-back before repair. Upon completion of CPD repair, flip-back of the repaired bases unlocks the BIR, displaces it, and initiates 
#C-C peak distances corresponding to the distance between the maxima of the characteris 
thymine positive density, along the C5-C5' and the C6-C6' axes. The position of the maximum was calculated as 

**The thymine bases have started to flip back to pair with adenine bases. 


both thymines have flipped back and are reannealing to adenine counterbases. 


Unlocked 


atoms, calculated as described in supplementary text S4 and fig. S13B. 


illumination (with “dark” designating a non- 
illuminated time-resolved control structure). 
Exceptions are the steady-state structures in the 
oxidized (ox/ss) and fully reduced states (Fdark, 
Ndark). Isomorphous difference electron density 
maps (ΔFo) (35) were calculated to decipher 
structural differences, which are designated as 
the difference between two states, ΔFo(Y-X). 

Major properties of each structural snap- 
shot are depicted in Table 1, which include 
features from the four main reaction loci (Fig. 
2). These properties are considered in the in- 
terpretation of each structural snapshot. For 
the ps-ns series, the focus was on CPD repair. The 
structural snapshots were hence assigned to 
match the TT status of the proposed interme- 
diates 1 to 5 (Fig. 1A) and designated as Intl to 
Int5 of the photolyase-DNA complexes (Table 1). 
For the ns-μs series with its already completed 
CPD repair reaction, structural snapshots reflecting 
postrepair reactions are designated as 
conformational intermediates (Con1 to Con4), 
with close variants being further subdivided 
by a, b, and so on. 

Structural analysis of the TR-SFX data was 
conducted in three rounds: First, on the basis 
of mainly structural changes at CPD and FADH− 
(two of the four reaction loci; Fig. 2), each time 
point was assigned an intermediate or inter- 
mediates that play a role in the catalytic cycle. 
Next, structural changes in the other two re- 
action loci were included to fine-tune the in- 
terpretation of each time point. Finally, all 
datasets and interpretations were considered 
together along with additional supporting evi- 
dence derived from electron density analyses, 
singular value decomposition (SVD) analysis, 
and quantum mechanical (QM) calculations. 
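
As a rough illustration of how SVD can support this kind of analysis, the sketch below decomposes a synthetic stack of difference maps into spatial components and their time courses. Everything in it is an assumption made for illustration: the data are simulated (two random spatial features with an exponential decay and rise plus noise), the 5% relative threshold is arbitrary, and the function name is ours. It is not the procedure used in the study, only a minimal example of the general technique.

```python
import numpy as np


def significant_components(maps: np.ndarray, rel_threshold: float = 0.05):
    """SVD of a (n_voxels, n_timepoints) matrix whose columns are flattened difference maps.

    Returns the left singular vectors (spatial components), the singular values,
    the right singular vectors (time courses), and the number of singular values
    above rel_threshold times the largest one.
    """
    u, s, vt = np.linalg.svd(maps, full_matrices=False)
    k = int(np.sum(s > rel_threshold * s[0]))
    return u, s, vt, k


# Synthetic example: two spatial features whose amplitudes change over time.
rng = np.random.default_rng(0)
delays = np.logspace(-10, -4, 12)   # hypothetical delays from 100 ps to 100 us
tau = 1e-9                          # hypothetical time constant (s)
decay = np.exp(-delays / tau)       # feature A fades
rise = 1.0 - decay                  # feature B grows in
feature_a = rng.normal(size=500)
feature_b = rng.normal(size=500)

maps = np.outer(feature_a, decay) + np.outer(feature_b, rise)
maps += 0.01 * rng.normal(size=maps.shape)   # measurement noise

u, s, vt, k = significant_components(maps)
print("singular values:", np.round(s[:4], 2))
print("components above threshold:", k)
```

In this toy case two singular values stand well clear of the noise floor, matching the two simulated processes; with real maps the number of significant components is a judgment call that benefits from the other lines of evidence listed above.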
tic CPD repair signals, that is, the cyclobutane negative density and the 5'- 
described in the Materials and methods summary and fig. S5. 
††These values are the average integrated negative electron densities from difference maps around the C5 and C6 


The four main reaction loci of CPD-DNA repair 
by MmCPDII 

Global structures of the MmCPDII-DNA com- 
plex, as determined by SFX, in both the oxi- 
dized and fully reduced states (ox/ss and Fdark 
in Fig. 2A and Ndark) agree well with prior 
synchrotron structures of the DNA complex 
in the oxidized states (16, 22) and MmCPDII 
SFX structures without bound DNA in all 
redox states (12). However, difference den- 
sity peaks are found at four main loci: the 
CPD structure (Fig. 2B), the FAD coenzyme 
(Fig. 2C), the active-site moieties R256 (R, Arg) 
and five-water cluster (5WC) (Fig. 2D), and the 
class II photolyase-defining BIR (Fig. 2E). 
These loci also show all prominent difference 
density map features after light-triggered DNA 
repair, though the general protein fold itself is 
not affected (fig. S4). In this section, we survey 


Fig. 2. Static structures of MmCPDII complexes with damaged DNA and 
illustration of the four main reaction loci. Structures of Fdark (reduced, green) 
superposed with ox/ss (steady-state oxidized, gray) are shown along with 
3σ-contoured difference maps between the dark and oxidized states, ΔFo(dark-ox/ss) 
(positive peaks in cyan, negative peaks in magenta). (A) Global structures showing 
the overall fold, with the bound DNA backbone in orange. (B) Structures of the 
bound CPD. (C) Structures of the isoalloxazine ring in the oxidized (FADox, gray) and 
fully reduced form (FADH⁻, gold), showing increased buckling of the FAD in the 
reduced state. (D) The 5WC/R256 locus at the active site. The interactions between 


the four main reaction loci and explain how 
the relevant data in Table 1 were obtained. 

1) CPD locus. To track CPD repair itself, we 
used two key indicators based on the characteristic 
CPD difference density features. First, 
we determined the C5-C5’ and C6-C6’ peak-to- 
peak distances (designated as C-C peak dis- 
tances) between the maxima of the difference 




CPD; FAD and active-site residues R256, W305, and W421; and the 5WC that only 
appears in the Fdark state are also shown. In the presence of the 5WC, R256 
undergoes a conformational change, with its guanidinium moiety becoming the final 
vertex in the bipyramidal 5WC. Some interatomic distances are highlighted by 
dashed lines, with values in angstroms nearby. (E) The BIR with its D428/W431/ 
R441 locking triad, as well as R429, which mostly interacts with the unpaired bases 
dA7' and dA8' complementary to the thymine dimer. The lack of substantial 
difference electron density here suggests little conformational change related to the 
flavin's redox state for this region. 


map peaks along the axes of the C5-C5’ and 
C6-C6’ bonds (Table 1, C-C peak distance C5/C6 
and footnote §; Materials and methods summary; 
and fig. S5). Second, we calculated the 
average integrated negative electron densities 
(designated as negative densities) from 
difference maps around C5 and C6 atoms as a 
function of time (Table 1, negative densities 
C5/C6 and footnote **; and supplementary 
text S4), which is a good indicator of the accu- 
mulation of repair intermediates where the 
corresponding bond had been broken. The assigned 
linkage and charge states of the 5’-thymine 
and the 3’-thymine are shown in Table 1 
(TT-state) using a notation (T<>T, T_T, T+T, 
T/T, -T-T-) that is defined in the corresponding 
footnote. 

2) FAD locus. Subtle redox-dependent changes 
around the flavin site, including increased buckl- 
ing of the FAD upon reduction (Fig. 2C), are 
consistent with our recent studies of MmCPDII 
photoreduction (12). In that work, we showed 
by monitoring the ρC and ρN dihedral angles, 
as defined in Fig. 1C, that the isoalloxazine 
ring of the FAD coenzyme undergoes a sequence 
of buckling and twisting motions during the 
process of photoreduction and protonation. 
In the oxidized state FADox, the isoalloxazine 
moiety was only very mildly buckled (ρC, ρN: 
2.0°, 2.0°), whereas in the catalytic FADH⁻ 
state, the isoalloxazine moiety was strongly 
buckled (ρC, ρN: 14.3°, 14.5°). In the semiquinone 
FADH• state, it moved closer to planarity (ρC, ρN: 
4.6°, 4.8°) (12). Because FADH⁻ is transiently 
oxidized into FADH• upon FET toward CPD 
and then recovered upon reaction completion, 
the ρC and ρN dihedral angles may well 
reflect the electron flow during CPD repair 
(Table 1, FAD state and ρC/ρN). 

3) 5WC/R256 locus. In the active site, the 
substrate interacts with several key features of 
the enzyme (Fig. 2D), including the adenine 
moiety of FAD, the highly conserved residue 
R256, and the structurally conserved 5WC. 
In the fully reduced state, that is, Fdark and 
Ndark, the 5WC bridges the bisphosphate 
backbone of FAD and R256 so that it con- 
tributes to the electrostatic stabilization of the 
anionic FADH⁻. Upon FET, R256 moves to 
stabilize the CPD radical anion and the 5WC 
becomes disordered or dynamic. Changes in 
5WC/R256 during repair are monitored in Table 1 
(5WC/R256 and footnote +). 

4) BIR locus. In the BIR (Fig. 2E), the D428- 
W431-R441 triad (D, Asp; W, Trp) acts as a lock, 
both stabilizing the unpaired bubble and firmly 
maintaining the protein-DNA complex (22), 
whereas R429 is more flexible and capable of 
dynamically interacting with both CPD com- 
plementary bases, dA7’ and dA8’. Here, post- 
repair base flip-back induced the opening of 
the BIR lock, as shown in Table 1 (BIR D428- 
W431-R441 and footnote §). 
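
The ring-geometry indicator used for the FAD locus is a pair of dihedral angles measured on the isoalloxazine ring. As a minimal sketch of how such an angle can be computed from refined coordinates, the function below returns the signed dihedral defined by four atomic positions. The function name and the coordinates in the usage note are hypothetical; the atoms that define each angle are those given in Fig. 1C of the source, which is not reproduced here, and the authors' own procedure may differ.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points.

    The angle is measured about the central p1->p2 axis. Which four
    atoms to pass for each flavin angle is an assumption left to the
    caller (see Fig. 1C of the source).
    """
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 /= np.linalg.norm(b1)  # unit vector along the central axis
    # Components of the outer bonds perpendicular to the central axis
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.degrees(np.arctan2(y, x))
```

For example, `dihedral_deg((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))` describes a planar, eclipsed arrangement. A planar ring fragment gives an angle near zero, so growing absolute values track increased buckling or twisting.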

Given our structural snapshots, the entire 
process of photolyase-mediated repair of dsDNA 
can be assigned to three periods, that is, early, 
middle, and late events, which correspond to 
CPD repair, active-site recovery, and thymine 
back-flip/DNA release, respectively. The loci of 
CPD structure and FAD geometry will be con- 
sidered first to assess the primary repair process. 
The remaining two reaction loci will then be 
separately addressed. Considering these results 
together, we then used them to assemble a mo- 
lecular movie covering DNA repair and ordered 
product release by photolyases. 


Early events: Electron transfer-driven repair of 
the CPD lesion 


The refined structures and difference maps 
from the ps-ns series monitor both the ring- 
opening steps of the CPD repair reaction and 
the concomitant FAD isoalloxazine moiety dy- 
namics (Fig. 3A). The C5-C5’ and C6-C6’ peak 
distances as well as the integrated negative 
electron densities (Fig. 3B) and the flavin dihedral 
angles ρC and ρN (Fig. 3C) are plotted 
over time. The structure of each time point in 
Fig. 3A represents the predominantly accu- 
mulated species of a snapshot during CPD repair. 
Although the structural changes shown here are 
subtle, the ΔFc(Y-dark) of each individual structure 
is always maximally correlated to its corresponding 
ΔFo(Y-dark) (fig. S6, A and B), which 
supports the reliability and significance of the 
refined structural changes and yields a clear 
view of the process of DNA repair. 
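
The agreement between calculated and observed difference maps can be expressed as a correlation coefficient. The sketch below computes a plain Pearson correlation between two maps sampled on the same grid, optionally restricted to a masked region. It is a generic real-space measure offered for illustration; the procedure behind fig. S6 is not described in this section, and the function name and mask argument are assumptions.

```python
import numpy as np

def map_correlation(map_a, map_b, mask=None):
    """Pearson correlation between two density maps on an identical grid.

    mask (optional) is a boolean array of the same shape that selects
    the voxels to compare, e.g. a region around one reaction locus.
    """
    a = np.asarray(map_a, dtype=float)
    b = np.asarray(map_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("maps must be sampled on the same grid")
    if mask is not None:
        m = np.asarray(mask, dtype=bool)
        a, b = a[m], b[m]
    # Subtract the mean so that only the shape of the signal is compared
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

A value close to 1 indicates that the calculated map reproduces the observed features, and a value close to 0 indicates no linear relationship.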

Accordingly, we have observed all proposed 
CPD repair intermediates 1 to 5 (Fig. 1A) in the 
enzyme-bound form as Int1/Int1* to Int5 (Table 1), 
where the Fdark structure represents Int1 (T<>T/ 
FADH⁻). At 650 ps, the magnitude of the negative 
density feature around the C5-C5’ bond is much 
higher than that around the C6-C6’ bond. In good 
agreement, the C5-C5' peak distance is 1.9 Å 
(Fig. 3, A and B, and Table 1), which implies a 
broken C5-C5' and an intact C6-C6’ bond, suggesting 
that the CPD ring-opening intermediate 
(T_T), Int3, is dominant in the F650ps 
snapshot. At F1ns, with the negative density 
around C5-C5’ being strong and that around 
C6-C6’ growing, both C5-C5’ and C6-C6’ 
difference density peak distances reach 1.9 Å. 
Here, the dominant species corresponds to the 
repaired CPD intermediate (T+T)•⁻, Int4. Interestingly, 
these trends continue to the F2ns snapshot, 
suggesting a continuing buildup of Int4. 
This in turn suggests that F1ns may still contain 
a contribution by Int3. 
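
The negative-density indicator used above can be illustrated with a short calculation. The sketch below sums the negative part of a difference map inside a sphere around one atom position. It assumes a difference map already resampled onto an orthogonal Cartesian grid, and the function name, the grid description (origin, spacing) and the radius are hypothetical choices for illustration. The authors' actual integration is described in their supplementary text S4 and may differ.

```python
import numpy as np

def integrated_negative_density(diff_map, origin, spacing, center, radius):
    """Integrate negative difference density within a sphere.

    diff_map : 3D array of difference density values
    origin   : Cartesian coordinates of voxel (0, 0, 0), in angstroms
    spacing  : voxel edge lengths along x, y, z, in angstroms
    center   : atom position around which to integrate, in angstroms
    radius   : integration radius, in angstroms
    Assumes an orthogonal grid, which is a simplification.
    """
    grid = np.asarray(diff_map, dtype=float)
    # Voxel indices in the same (C) order as grid.ravel()
    idx = np.indices(grid.shape).reshape(3, -1).T
    coords = np.asarray(origin, dtype=float) + idx * np.asarray(spacing, dtype=float)
    dist = np.linalg.norm(coords - np.asarray(center, dtype=float), axis=1)
    values = grid.ravel()
    selected = (dist <= radius) & (values < 0)
    # Sum of negative values times the voxel volume
    return float(values[selected].sum() * np.prod(spacing))
```

Evaluating this quantity around C5 and C6 for each time point would give curves of the kind plotted as dashed lines in Fig. 3B, with more negative values indicating a larger population in which the corresponding bond is broken.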

The first structure after photon absorption 
corresponds to the activated coenzyme, FADH⁻*. 
The F100ps snapshot (Fig. 3A) might represent 
this state because its CPD structure is unaltered, 
whereas its isoalloxazine ring has greatly 
changed. Nevertheless, the R256 side chain 
has moved toward CPD, suggesting that FET 
from FADH⁻* has already begun at this point. 
In the F250ps snapshot, the isoalloxazine ring 
continued to swing, as shown by sign changes 
of its ρC and ρN angles (Fig. 3C). Although the 
CPD structure still remains unchanged, the 
nearby active-site moieties R256/5WC have 


1 December 2023 


already begun to move, as described later. These 
changes suggest that FET is in progress during 
both the F100ps and F250ps snapshots, which 
represent a mixture of species, T<>T/FADH⁻* 
(Int1*) and (T<>T)•⁻/FADH• (Int2). This electron 
transfer-dependent reorganization is in 
good agreement with the proposal that such 
ultrafast redox transitions are stabilized by rapid 
isoalloxazine fluttering (36) and harmonic motions 
of other light-gathering cofactors after 
absorption (28). 

In the F450ps snapshot, a C5-C5’ peak dis- 
tance could be measured for the first time 
because C5-C5’ negative density has accumu- 
lated. Nevertheless, in the refined structure, 
both the C5-C5’ and C6-C6’ bonds remained 
intact. Hence, this snapshot can be assigned to 
an Int2/3 mixture, in which Int2 is the major 
species. Interestingly, from F450ps up to F1ns, 
structures showed relatively planar FAD isoalloxazine 
geometries compared with the dark, 
fully reduced state (Fig. 3C), which is expected 
for the semiquinoid FADH• form (12). This indicates 
that FET is mostly finished in the F450ps 
snapshot and that the radical anionic CPD 
species dominate these snapshots. However, im- 
mediately after the 1-ns delay (at 2 and 3.35 ns), 
there are notable oscillatory motions of the 
FAD’s isoalloxazine ring, as indicated by the ρC 
and ρN dihedral angles, which correlate well 
with fluttering that is associated with rapid 
relaxation after sudden redox changes (7). This 
fluttering suggests back electron transfer (BET) 
from the thymine radical anion to the coen- 
zyme and is in good agreement with the BET 
time constant of ~700 ps that was determined 
during time-resolved spectroscopic studies 
of CPD repair by other photolyases (37, 38). 
Accordingly, we assign the F2ns and F3.35ns 
snapshots as representatives for the BET re- 
action, which should contain varying mixtures 
of (T+T)•⁻/FADH• (Int4) and (T+T)/FADH⁻ 
(Int5). In the F6ns snapshot, the isoalloxazine 
oscillatory motion appears to have receded, 
and therefore we propose that F6ns consists 
predominantly of the product after completion 
of the CPD repair, Int5. Because the structural 
properties of the next snapshot, F10ns, are very 
similar to those of F6ns (Fig. 3A and Table 1), 
F10ns represents further accumulation of the 
Int5 product. 


Middle events: Base stacking and FAD 
site recovery 


The ns-μs series of TR-SFX data addresses 
conformational changes of the repaired thy- 
mines and enzymatic moieties during product 
release. Throughout this series, the status of 
the intact thymines (T+T) and the coenzyme 
FADH⁻ corresponds chemically to Int5. Nevertheless, 
given that the T+T and FADH⁻ conformations 
and active-site arrangements also 
vary with time, we describe them as distinct 
conformational intermediates. Notably, the 

Fig. 3. Bond breaking and ring opening during cleavage of the cyclobutane 
ring. (A) Structures of the CPD moiety (T<>T) of the bound DNA as well as the FAD 
isoalloxazine moiety from the first TR-SFX series (orange), from F100ps to F10ns, 
each superposed with the reduced dark structure (Fdark in green and FADH⁻ 
in gold). Each panel also shows 3σ-contoured ΔFo(Y-dark) maps, with positive peaks 
in cyan and negative ones in magenta. (B) Plots of C5-C5' (green) and C6-C6' 
(orange) peak distances (full lines) and average integrated negative electron 
densities (dashed lines) over time. Because both the negative cyclobutane and 
the positive 5'-thymine peaks are necessary to calculate peak distances, when 
either was missing, the distance could not be calculated and no value is plotted 
(100 to 250 ps for C5-C5' and 100 to 650 ps for C6-C6'). (C) Plots of ρC (blue) 
and ρN (red) dihedral angles over time. ET, electron transfer. 


N10ns snapshot (Con1, the first structure in 
Fig. 4A) and the F6ns/F10ns snapshots (Int5; 
Fig. 3A) coincide closely, thus providing further 
evidence for the adequacy of the experimental 
parameters used in both series. To our 
surprise, before the onset of product release, 
the intact thymines stay fully stacked in the ac- 
tive site for up to 500 ns (Table 1 and Fig. 4A). 
Apparently, the active site undergoes relaxation 
before the two repaired thymine bases start to 
flip back out of the active site. In the N100ns 
and N500ns snapshots, the isoalloxazine geometry 
of the FADH⁻ coenzyme returns to the initial 
fully reduced state (Fig. 4, A and C). These 
two structures, assigned as Con2a and Con2b 

Fig. 4. Postrepair events that lead to thymine-base return and DNA release. 
(A) Structures of the newly repaired thymines as well as the isoalloxazine moiety 
from the second TR-SFX series (orange), from N10ns, N100ns, and N500ns, each 
superposed over the reduced dark structure (Ndark CPD is shown in green, whereas 
its isoalloxazine moiety is depicted in gold). Each panel also shows 3σ-contoured 
ΔFo(Y-dark) maps, with positive peaks in cyan and negative ones in magenta. (B and 
C) Peak distance plots along the C5-C5' (green) and C6-C6' (yellow) axes (B) and 
ρC (blue) and ρN (red) angle plots (C) over time for the second TR-SFX time series. 
(D) Structures of the newly repaired thymines from the second TR-SFX series 
(orange) during their return to the dsDNA, including N25μs and N200μs, each 
superposed over the reduced dark structure (Ndark in green). Because the two 
MmCPDII-DNA complexes in N200μs are at two different DNA release stages, 
they are both shown here. Each panel also shows 3σ-contoured ΔFo(Y-dark) 
maps as in (A). (E) Difference maps at the dsDNA, from left to right: N10ns, 
N500ns, N25μs, and complex B from N200μs. For all complexes, 3σ-contoured 
ΔFo(Y-dark) maps are superposed over the entire DNA molecule. 

intermediates, respectively, differ only slightly for 
the 5WC/R256 locus (see below). Interestingly, 
during photoreduction of MmCPDII without 
bound DNA, the isoalloxazine ring similarly 
adopts its full buckling at about 300 ns after its 
light-driven FADH• → FADH⁻ transition (12). 

An interesting observation from these data 
is the rapid decrease of isoalloxazine dihedral 
angles during FET but slow increase and re- 
covery after BET (Fig. 3C and Table 1). A possible 
explanation is that FET occurs between CPD 
and the light-excited FADH⁻* in its S1 state 
(39). The oxidation of FADH⁻* into FADH• is a 
downhill transition from an unstable, high-energy 
state to a stable, low-energy state, whereas 
reduction of FADH• to FADH⁻ represents a 
transition between two ground states (D0 to S0). 
Although BET transiently perturbs the FADH⁻ 
structure, as evidenced by strong torsion and 
fluttering between 1 and 3 ns (Fig. 3C), the accumulated 
energy within the isoalloxazine moiety 
is only slowly dispersed over hundreds of nanoseconds, 
as seen both here and previously for the 
MmCPDII photoreduction (12). 


Late events: Base back-flip and onset of 
reannealing of dsDNA 


The isoalloxazine geometry no longer changes 
in the 25 and 200 μs snapshots (Fig. 4D). Here, 
the repaired thymine bases finally move out of 
the active site and flip back toward the dsDNA. 
In a continuation from prior time points, where 
postrepair movement concentrated around 
the 5'-thymine (Figs. 3A and 4A), it is the 5’- 
thymine (dT7) that first starts to exit the active 
site during the N25μs snapshot. However, this 
back-flipping base is too disordered to be refined 
because only the pentose is well resolved (Fig. 
4D). Because the two thymine bases are no 
longer stacked and the 5’-thymine flips back, 
this structure can be considered as a “thymine 
back-flipping intermediate,” which we define 
as TT status T/T (Con3a), reflecting the non- 
stacking relationship between the two thymine 
bases. Meanwhile, the 3'-thymine dT8 is tilted 
in such a way that it occupies the full space 
within the active site (Fig. 4D). The N200μs 
snapshot is particularly interesting because the 
two thymine bases continued the back-flipping 
process but were captured at different stages 
in the two complexes in the asymmetric unit. 
In complex A, the dT7 has further moved and 
its density has now become well defined (in- 
termediate Con3b, TT status: T/T). In com- 
plex B, both thymine bases have flipped back 
and undergo base pairing with their respective 
complementary bases. This complex can be de- 
signated as the “product-reannealing complex” 
(Con4), which corresponds to regular dsDNA 
and is denoted as the “-T-T-” TT status. There 
are extra difference signals in complex A that 
can be fit to Con4 (30 to 50%) and likewise in 
complex B that correspond to Con3b (30 to 
40%) (fig. S7). This population shift between 
complexes A and B in the N200μs snapshot 
likely derives from crucial crystal contacts made 
by complex A but not by complex B (fig. S8). In 
terms of thymine back-flipping, complex B is 
hence less constrained by the crystal lattice and 
can restore the integrity of the duplex DNA in a 
shorter time frame. Considering that repaired 
dsDNA is known to dissociate from the photo- 
lyase, it is reasonable to consider Con4 as a 
transient intermediate captured just before 
final complex dissociation. Unlike all structures 
up to 500 ns that show only a few difference 
density peaks in the DNA region (Fig. 4E, left), 
the N200μs complex B reveals extensive difference 
density features along the entire dsDNA 
(Fig. 4E, right). 


Active-site dynamics during CPD repair: The 
5WC/R256 locus 


Concomitant to the reaction course of CPD 
repair as outlined before, further ΔFo peaks 
nearby hint at conformational changes in two 
highly conserved features of the class II photo- 
lyase active site: the 5WC and the side chain of 
R256 (Fig. 2D). Only in the catalytically active 
fully reduced state (Int1), but not the oxidized 
state (ox/ss), does the 5WC interact with one of 
the Nη atoms of R256 to form a square bipyramidal 
structure that connects the FAD phosphate 
backbone, the CPD 3'-thymine, and R256 
(Fig. 2D). Upon reaction initiation, that is, at 
100 ps, R256 quickly retracts from the 5WC 
toward the CPD, with the side chain still being 
mobile and showing two alternative confor- 
mations (Fig. 5A). During the further reaction, 
the 5WC loses its order, whereas the R256-CPD 
interaction persists, as shown for Int3 of the 
F650ps snapshot as an example (Fig. 5B). Based 
on these structures and the corresponding struc- 
tures of all time points (fig. S9), we suggest that 
the 5WC/R256 locus allows fast reorganization 
during the electron transfer reaction. In the 
resting, that is, fully reduced, state, 5WC/R256 
electrostatically stabilizes the negatively charged 
FADH⁻. Upon FET, R256 preferably electrostatically 
stabilizes the CPD•⁻ anion radical 
rather than the FADH• radical and destabilizes 
the ordered 5WC. In support of this hypothesis, 
the 5WC could not be found in either the ox/ss 
(Fig. 2D) or in any of the room-temperature apo 
structures of MmCPDII (fig. S10) (12), indicating 
that for the formation of a square bipyramidal 
structure of 5WC, anionic FAD and a bound CPD 
substrate are necessary. In MmCPDII-CPD-DNA 
complexes obtained under cryogenic conditions 
at the synchrotron, a highly similar six-water 
cluster (6WC), in which the sixth water replaces 
R256, can be observed (16, 22) (fig. S10). Given 
that such a 6WC transiently appears during the 
BET, that is, in the F3.35ns snapshot (fig. S9), 
we propose that, for the synchrotron structures, 
the 6WC mimics an intermediate derived from 
either cryo-trapping or x-ray induced electron 
transfer. 

Our results on the electrostatic stabilization 
of the CPD radical anion by R256 are corrob- 
orated by previous mutational analyses (37, 40). 
These studies showed that the corresponding 
R342A mutation (A, Ala) in Escherichia coli 
photolyase caused impaired binding for photo- 
damaged DNA and a diminished quantum yield 
for CPD repair (37). Accordingly, the 5WC/R256 
locus of class II photolyases may act as a fail-safe 
device for FADH⁻, priming it for FET activity 
only in the presence of a bound CPD lesion. 

The 5WC is reordered at 500 ns (Fig. 5C), 
when the isoalloxazine ring of FADH⁻ has 
returned to its relaxed fully reduced state. 
Accordingly, the Con2b intermediate can be 
understood as the “enzyme-recovered product 
complex” before initiation of product release. 
In the N100ns snapshot (Con2a), the 5WC adopts 
a slightly different geometry when compared 
with both the dark and N500ns states, which 
apparently represents a structural snapshot of 
the process toward the enzyme-recovered product 
complex. The concept of enzyme recycling is 
well known, but it is notable that the enzyme 
does not rush to release the product before 
resetting the conformations of its active-site 
moieties that have been altered during catalysis. 
In this way, the enzyme may drive product 
release while being primed for cognate substrate 
recognition and the next reaction cycle. 


Role of the BIR during product release 
by photolyases 


Whereas the 5WC/R256 locus is involved in 
fine-steering the active site’s affinity for cognate 
CPD substrates and repaired products, the BIR 
acts in stabilizing the unpaired bubble of the 
bound dsDNA, that is, the DNA region vacated 
by CPD flip-out upon binding to MmCPDII 
(Fig. 2E) (22). Unlike the 5WC/R256 locus, the 
BIR locus remains essentially unchanged dur- 
ing the first 500 ns (Fig. 5D and fig. S11) but 
begins showing slight changes for the N25μs 
snapshot (Con3a) and then, as expected, large-scale 
changes in both complexes of the N200μs 
snapshot, where the two thymines have partially 
entered the unpaired bubble (Con3b and Con4 
in Fig. 5, E and F; and figs. S7 and S11). As the 
5'-thymine first flips back toward the unpaired 
bubble (Figs. 4D and 5E), the resulting rota- 
tion of the intra-CPD phosphate causes BIR 
residue R441 to swivel away from the phos- 
phate by changing its center of mass distance 
from 4.4 to 7.2 Å (Fig. 5E). This opens the BIR 
“lock” and allows both thymines to return se- 
quentially to the unpaired bubble. Once filling 
the unpaired bubble, the two thymine bases 
stack well with the preceding dC6 base and 
interact with their complementary bases dA7’ 
and dA8’, essentially displacing D428 and R429, 
while R441 moves back closer to the dT8 phos- 
phate, stabilizing it in its new orientation (Fig. 
5F). However, the remaining BIR element W431 
continues interacting with dC9 and the thymine 

Fig. 5. Roles of the 5WC/R256 locus during DNA repair and the BIR locus 
during DNA release. (A to C) Detailed structures of the 5WC/R256 locus 
[orange, F100ps for (A), F650ps for (B), and N500ns for (C)] superposed with 
the corresponding dark structure (green). The FAD coenzyme is depicted in gold 
for both structures in each panel. The R256 side chain is dynamic, with two 
alternative conformations at 100 ps. 3σ-contoured ΔFo(Y-dark) maps are shown 
as well, with positive peaks in cyan and negative ones in magenta. The characteristic 
dark 5WC is shown as green spheres, whereas the remaining water molecules for 
each specific time-resolved snapshot are in red. Here, blue dashed lines highlight the 
5WC arrangement in each time-resolved snapshot. Notably, already at 100 ps (A), 
the square bipyramidal interaction between 5WC and R256 is broken, resulting in a 


interatomic distances between the center of mass of the R256 guanidinium moiety 
and the O2 carbonyl of each of the CPD thymines highlighted with black dashed 
lines. Distances in angstroms are provided nearby and are color-coded to match the 
structures. (D to F) Detailed view of the unpaired bubble and the BIR residues D428, 
R429, W431, and R441 with Ndark (green) superposed onto three individual 
time-resolved snapshots [N500ns (D), N200μs complex A (E), and N200μs complex B 
(F)]. dT7 and dT8, corresponding to the CPD bases, as well as all BIR residues are 
shown as stick models, whereas all other bases and amino acids are shown as 
cartoon representations. 3σ-contoured ΔFo(Y-dark) maps as above are also shown. 
To highlight BIR lock opening, marked by the movement of R441, the R441 to CPD 
phosphate center-of-mass distance is shown as a red dashed line, with color-coded 


square pyramidal 


bases, which rationalizes why the repaired DNA, 
after losing so many interactions at the active 
site, still sticks to the enzyme before full 
release and why the bound and still-kinked 
DNA at this point is highly dynamic, as shown 
by the large number of diffuse difference den- 
sity peaks around the DNA backbone (right 
structure in Fig. 4E). Indeed, clustering anal- 
ysis of steered molecular dynamics simulations 
restrained by the ΔFo(200μs-dark) electron 
density map revealed that dsDNA in complex 
A adopts a single major conformation simi- 
lar to the starting structure (fig. S12). By con- 
trast, the dsDNA of complex B showcases two 
major conformations with different degrees 
of deformation in the unpaired bubble re- 
gion, supporting that DNA release starts at the 
unpaired bubble and extends from it toward 
the upstream and downstream regions of the 
DNA (fig. S12). 

Upon full DNA release, even this last protein- 
DNA interaction will be broken, and dC9 and -T-T- 
co-stack again as in canonical dsDNA. It is 
feasible that CPD-DNA binding by MmCPDII 
arrangement. R256 and the CPD are shown as stick figures, with 


follows the reverse sequence of events: First, 
W431 intercalates between the 3’-thymine of the 
CPD lesion and its downstream base, enforcing 
suboptimal base stacking on both sides of the 
CPD lesion. Then, an onset of interactions be- 
tween R429 and the counterbases dA7’ and dA8’ 
(16) supplies further stabilization. Finally, 
the R441-D428 lock facilitates CPD phosphate 
rotation, thus causing the CPD to flip into the 
enzyme’s active site. 


Complementary approaches to validate the 
TR-SFX-derived photolyase mechanism 


To provide further support for our proposed 
structural mechanism, we performed three 
additional analyses on the TR-SFX data as a 
whole, with details provided in supplementary 
texts S4 to S6. First, we developed a kinetic 
model (supplementary text S4 and fig. S13A) 
based on the accumulation of negative den- 
sities around the two affected bonds, C5-C5’ 
and C6-C6’, during the first 3.35 ns of the re- 
action (Fig. 3B and fig. S13B). By numerical 
integration of this model, we could deter- 
distances in angstroms provided nearby. 


mine that indeed Int3, that is, (T_T), was the 
dominant species between 450 and 650 ps, 
whereas a mixture of Int4 and Int5 became 
dominant after 1 ns (fig. S13C), as described by 
our molecular structures (Fig. 3A). These re- 
sults show that the kinetic data derived from 
the negative density accumulation correlate 
well with the derived intermediate structures 
and thus produce a valid reaction mechanism 
(fig. S13D). 
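
To illustrate what numerical integration of a kinetic model involves, the sketch below integrates a chain of irreversible first-order steps with a fixed-step fourth-order Runge-Kutta scheme. The sequential scheme, the function name and the rate constants in the usage note are assumptions made purely for illustration. The model actually used is defined in supplementary text S4 and fig. S13A, which are not reproduced here.

```python
import numpy as np

def integrate_sequential(rates, t_end, dt=1.0):
    """Integrate A1 -> A2 -> ... -> An (irreversible, first order) with RK4.

    rates : n-1 rate constants, in inverse time units (hypothetical values)
    t_end : total integration time, same time units
    dt    : fixed step size
    Returns (times, populations); populations[i, j] is the fraction of
    species j at times[i], starting from pure A1.
    """
    k = np.asarray(rates, dtype=float)
    n = k.size + 1

    def deriv(p):
        flux = k * p[:-1]      # conversion rate of each step
        d = np.zeros(n)
        d[:-1] -= flux         # loss from the upstream species
        d[1:] += flux          # gain of the downstream species
        return d

    steps = int(round(t_end / dt))
    times = np.arange(steps + 1) * dt
    pops = np.zeros((steps + 1, n))
    pops[0, 0] = 1.0
    for i in range(steps):
        p = pops[i]
        s1 = deriv(p)
        s2 = deriv(p + 0.5 * dt * s1)
        s3 = deriv(p + 0.5 * dt * s2)
        s4 = deriv(p + dt * s3)
        pops[i + 1] = p + dt / 6.0 * (s1 + 2 * s2 + 2 * s3 + s4)
    return times, pops
```

Calling `integrate_sequential([1 / 300.0, 1 / 400.0], t_end=3350.0)` with time in picoseconds yields three population curves. The species with the largest population at a given delay is the one that would be expected to dominate the corresponding snapshot, which is the kind of comparison drawn between fig. S13C and Fig. 3A.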

Second, we performed SVD analysis of both 
the ps-ns and ns-μs time series as described in 
supplementary text S5 and figs. S14 and S15. 
For the ps-ns series, the first main component 
(SV1) acted by symmetrically separating the 
CPD bases, whereas the second one (ISV2) 
dampened the effect of the first one in the vi- 
cinity of C6-C6', effectively delaying their sepa- 
ration (fig. S14, A and B). Meanwhile, SVD 
results for the ns-μs series agreed very well 
with our molecular structures (Fig. 4 and fig. 
S15) and supported a three-state mechanism 
for DNA release: (i) local CPD changes prevalent 
at 100 to 500 ns (ISV1), (ii) global DNA 
changes during complex release at 200 μs (SV2), 
and (iii) base flip-back between 25 and 200 μs 
(SV3). Because SVD is performed over the en- 
tire asymmetric unit, including complexes A 
and B, we performed further structural analyses 
to show that complexes A and B of 200 μs 
involve shifts of populations (fig. S7). 
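
The SVD step can be sketched in a few lines. Each difference map is flattened into one column of a data matrix, so that the left singular vectors are map-like components and the right singular vectors are their time courses. The function name and array layout below are assumptions for illustration, and the preprocessing described in supplementary text S5 (map scaling, masking, choice of region) is not reproduced.

```python
import numpy as np

def svd_time_series(maps):
    """SVD of a time series of difference maps.

    maps : array of shape (n_timepoints, n_voxels), one flattened map per row
    Returns (U, s, Vt) where the columns of U are spatial components,
    s holds the singular values in decreasing order, and the rows of Vt
    are the matching time courses.
    """
    data = np.asarray(maps, dtype=float).T   # columns correspond to time points
    U, s, Vt = np.linalg.svd(data, full_matrices=False)
    return U, s, Vt
```

The number of singular values that stand clearly above the noise floor suggests how many independent processes are needed to describe the series, and plotting each row of `Vt` against the delay times shows when the corresponding spatial component is populated.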

Third, we sought to validate our interpreta- 
tions of time-dependent structural changes with 
QM computational analysis by using the density 
functional theory method to calculate the highest 
occupied molecular orbitals of the TR-SFX- 
derived atomic coordinates of CPD repair inter- 
mediates (supplementary text S6 and fig. S16A). 
Our data showed overlapping orbitals up to 
650 ps (Int3) and formation of a node at 1 ns 
(Int4; fig. S16B), which supports complete 
rupture of the cyclobutane ring between 650 ps 
and 1 ns. Furthermore, when the calculation 
was repeated with addition of a negative 
charge to CPD, node formation then occurred 
at 3.35 ns, that is, BET involving Int4 and Int5 
(fig. S16C), which reflects the extra time needed 
for BET. 


Discussion 
Comparing TR-SFX data with kinetic data from 
spectroscopic studies of photolyases 


Even though this work focused on structural 
details of the intermediates and processes during 
catalysis of DNA repair by a class II photolyase, 
the kinetics and order of occurrence of inter- 
mediate species should resemble that from spec- 
troscopic studies in solution, though detailed 
data can differ because of different experimental 
conditions and sample sources. Overall, our re- 
sults are in close agreement with results of 
previous ultrafast time-resolved spectroscopic 
studies, wherever a comparison can be made, 
such as the chemical structures of the intermediates 
of cyclobutane repair (1, 4, 38), and 
the roles of active-site residues such as R256 
and hydration dynamics (37). The role of primary 
hydration dynamics in stabilizing and mediating 
the charge transfer reactions, which had been 
previously hypothesized based on ultrafast spec- 
troscopic analysis of the DNA repair reaction (4), 
has been now given a molecular and structural 
basis. Here, the R256-5WC locus plays a key role 
in stabilizing the negative charge of the electron 
that travels back and forth between the FADH⁻ 
coenzyme and the CPD lesion. 

Furthermore, we can also address some of 
the differences in the kinetics and quantum 
yields reported previously for different photo- 
lyases. For example, our data clearly indicate that 
the occupancy of activated complexes during 
CPD repair stays constant from 100 ps to 10 ns 
(table S1). This result signifies that once FET has 
occurred, the reaction proceeds with almost 
100% efficiency and, conversely, that the ob- 
served quantum yields for repair depend on 
nonproductive deexcitation of FADH⁻*. Another 
possible culprit for lowered quantum yields is 
the proposed intramolecular electron transfer 
(iET) between the FADH⁻ isoalloxazine and 
adenine moieties (20, 41). We believe that the 
latter is a plausible explanation, although our 
data showed no obvious time-dependent con- 
formational changes around the adenine moiety. 
iET is part of the FET pathway in class II 
photolyase-mediated DNA repair but is not 
predominant in class I photolyase-mediated 
repair (20). This hypothesis, which also implies 
that iET is completed within the overall FET 
time frame of about 100 to 250 ps, fits well 
with the comparatively low quantum yield of 
MmCPDII (~25%; fig. S18) compared with yields 
of 45 to 100% reported for class I photolyases 
(19, 42). These results suggest that different 
photolyases, despite high structural homology, 
differ in their kinetic schemes, which is also 
reflected by the variations of kinetic constants 
and quantum yields depending on class, species, 
substrate, and reaction conditions (20, 38, 41, 43). 
Because our TR-SFX analysis of MmCPDII 
simultaneously examined multiple reaction loci, 
as well as chemical and conformational prop- 
erties, the order of events found in this study 
should reflect the events across a realistic 
timescale, though it was not our intent to 
determine detailed kinetic constants. Never- 
theless, our finding that BET occurs 1 to 2 ns 
after photon absorption and FET (Fig. 3C) is 
comparable to findings of previous reports 
about class II photolyases, where (T+T)•⁻ appeared
~600 ps after photon absorption and
BET was completed 850 ps later (20). 

Prior spectroscopic data had suggested that 
bond breaking occurred sequentially, with 
C5-C5’ breaking first, followed by C6-C6’ (Fig.
1A) (1, 2, 4, 7, 40). Our results have validated
this mechanism and characterized the chem- 
ical structures of the ring-opening interme- 
diates in the enzyme-bound form. 


The issue of multiphoton effects on DNA 
repair by photolyases 


Although it is conceivable that FAD dynamics, 
and possibly the resulting FET kinetics, are
affected by multiphoton excitation, the focus of
this work is the repair of DNA that contains 
CPD lesions, which should be minimally af- 
fected, if at all, by multiphoton effects on the 
basis of the following mechanistic and experi- 
mental considerations: First, the structural moiety 
being repaired, a CPD, does not interact at 
400 nm with the excitation light. Initiation of 
its repair only requires an electron through FET 
from the excited FADH⁻*. Our results show that
FET is completed in ~250 ps, whereas CPD 
repair takes place mainly from 0.45 to 1.0 ns, 
and the repair mechanism itself should be un- 
affected by the mechanism and kinetics of FET. 
This conclusion is further supported by relevant 
literature and additional control experiments: 
Previous spectroscopic studies showed that elec-
tronic relaxation of free FADH⁻* proceeds within
5 to 10 ps but takes about 1.8 ns when bound to
free photolyases (6), indicating that there is
sufficient time to generate higher states than
S1 by excitation of FADH⁻*. However, relaxation
of FADH⁻* from S2 and higher states to S1
occurs within a few ps (24) or even faster by
internal conversion. Even if multiple photon
absorption by FADH⁻ may cause high-energy
states, which eject a free electron by reaching 
the ionization continuum, this process might 
be a biologically feasible FET mechanism. In the 
presence of the 8-hydroxy-deazaflavin antenna 
chromophore, light energy absorbed by the 
antenna is predicted to cause free-electron 
ejection from FADH⁻ by a process called inter-
molecular coulombic decay (44). Although the
effects on quantum yield cannot be discounted, 
effects on the FET speed appear rather unlikely 
according to our TR-SFX data, because previ-
ously published FET time constants from time-
resolved spectroscopy range between 200 and
600 ps, for example, 209 ps for a class I and 
565 ps for a class II photolyase (20). Thermal 
effects caused by vibrational cooling upon back-
conversion of high-energy states of FADH⁻* are
also unlikely to affect catalysis because we did 
not observe substantial difference map peaks 
in the CPD environment of the photolyase-DNA 
complex in its oxidized state under multiphoton 
conditions (fig. S1, A and B). 

Second, the interpretation of our results is 
based on multiple active-site moieties (Fig. 2, A 
to E) that all display different time-dependent 
changes. As mentioned above, CPD repair is
orders of magnitude slower than multiphoton 
relaxation. Based on Fig. 6A, the recovery of 
active-site residue conformations (500 ns) and 
the thymine back-flipping and product return 
(µs range) are even slower and thus unlikely to
be affected by any multiphoton effect. 

Third, it is interesting to note the high degree 
of correlation between the CPD ΔFo features in
the F10ns and N10ns snapshots (see Materials
and methods summary). Because simultaneous
multiphoton excitation may occur in the ps-ns
series (pulse duration: 0.98 ps) but not in the
ns-µs series because of long pulse duration
(3 ns), as addressed in supplementary text S3, we
would expect the F10ns and N10ns snapshots
and their difference maps to differ considerably 
if multiphoton excitation affected repair kinetics 
in the ps-ns series. 

Finally, as described in the beginning of the 
Results section, the power titration analysis 
suggested that the power used in our experi- 
ments is just enough, in contrast to the cal- 
culated values that suggest excessive photons. 
This disparity has also been reported recent- 
ly (45), which suggests that far fewer pho- 
tons than the nominal values are absorbed 
for reasons that require further investigation. 
An alternative scenario, especially for proteins 
optimized for light-driven electron transfer, is
that “proteins may have evolved to direct all
deposited energy toward functional outcomes”
(25, 29).

Fig. 6. The complete timeline for major events and video presentation.
Evolution of the average C5-C5' and C6-C6' atomic distances of the CPD and
dihedral angles of the isoalloxazine ring of FAD, as derived from our structural
models. The intermediate number or numbers for each time point, and the catalytic
hallmarks of different time periods, are indicated. The intermediates up to 10 ns
are reaction intermediates for CPD repair (Int1 to Int5, indicated as I1 to I5), whereas
the intermediates from 10 ns upward are conformational intermediates (Con1


The issue of identifiable reaction intermediates 
from TR-SFX experiments 


Many TR-SFX reports have focused on dynamic 
features of a specific chemical or physical step 
(28, 46-48), and others have used kinetic mod- 
els from spectroscopic studies to guide their 
search for reaction intermediates (49). By con- 
trast, our work deals with an entire catalytic 
cycle with many elementary steps not assigned 
previously, such as ordered product release. 
Intermediates of chemical or enzymatic reac- 
tions are commonly considered stable when 
detectable by kinetic or biophysical methods. 
These methods are usually limited by their 
capacity to detect characteristic signals for 
each intermediate moiety and thus provide 
mostly local information about the reactants 
themselves. Notably, the International Union 
of Pure and Applied Chemistry (IUPAC) Gold
Book (50) defines an intermediate as “a mo- 
lecular entity with a lifetime appreciably longer 
than a molecular vibration that is formed from 
the reactants and reacts further to give the 
products of a chemical reaction.” Because TR-SFX 
captures global information about all atoms 
within the enzyme-substrate complex, in our 
case, a photolyase bound to the biopolymeric 
CPD-DNA, it should be able to unravel multi- 
ple intermediates, or molecular entities, in a 
chemical step or a process (e.g., conformational 
change, product release), particularly because 
they can be stabilized by the enzyme. 
Conversely, because each time point may 
consist of multiple intermediates with vary- 
ing populations, each snapshot listed in 
Table 1 represents either the predominant 
species of that particular snapshot or a mix- 
ture of species. Importantly, we based our
assignments not only on the refined structural
models but also on the features of the ΔFo
maps, such as C-C peak distances and the 
accumulation of negative densities along the 
relevant bonds (Fig. 3B). These assignments 
were then further validated by other relevant 
structural changes at the active site, such as 
the status of the dihedral angles of the FAD 
and the structural movements of the 5WC/ 
R256 locus (Table 1), and by different experi- 
mental approaches (see supplementary texts 
S4 to S6). 

The reliability and importance of the refined 
structural changes are supported by correla- 
tion coefficient analysis (fig. S6), which quan- 
tifies the similarity between the observed and 
calculated difference electron density maps. 
Especially within the initial 2 ns, where prior 
solution spectroscopic studies predicted that 
most chemical changes would occur (5, 20), 
individual structures strongly correlate with 
their corresponding density maps but not with 
those of their neighboring time points. 

A conformational change is commonly con- 
sidered as a simple two-state process, where a 
time-dependent structural change corresponds 
to a shift of population between two states. In 
our view, a conformational change could be a 
continuous or multistep process, as demonstrated 
recently by a temperature-resolved cryo-electron 
microscopy study of a ligand-induced conforma-
tional change of another enzyme (51). In this
work, we found that the active site of the en- 
zyme goes through a multistep process (10 to 
500 ns) to recover from catalysis, and we iden- 
tified two variants, Con2a and Con2b, in the 
process. For the product release, we also iden- 
tified two variants (Con3a and Con3b) in the 
process of back-flipping of the repaired thymines 
and characterized the reannealed product Con4. 
Increased conformational freedom around the
DNA also leads to the shift of population be-
tween Con3b and Con4 (figs. S7 and S8).

to Con4, indicated as C1 to C4) in the processes of active-site recovery and
product release. Importantly, because the refined structural models may represent an
average structure derived from the composite of different intermediates that are
present at any given time point, the evolving C5-C5' and C6-C6' distances should
not be simply considered as bond elongation but rather as shifts in the populations
of bond split-unsplit intermediates. In any case, these values visualize the progress
along the reaction coordinate for CPD cleavage over time.

For the CPD repair in the ps-ns series, the 
structure from a snapshot often consists of 
multiple intermediates. Those with distinct 
structures in a predominant state can be well 
characterized, such as the ring-opening inter-
mediates in the F650ps (C5-C5’ cleaved) and
F1ns (C6-C6’ also cleaved) snapshots. Those of
the electron transfer processes (FET and BET) 
are not directly resolvable by TR-SFX before 
the occurrence of the next step. However, the 
electron transfer processes can be indirectly 
identified from changes in the coenzyme and 
active-site residues. 


Molecular movie of photolyase catalysis 


Our results described in this work are sum-
marized in an overall timeline (Fig. 6) and in a
molecular movie (Movie 1) that enables visu- 
alization of the structural basis of (photo) 
enzymatic catalysis for this multistep reaction. 
The FET reaction leads to the accumulation of 
ring-opening intermediates, followed by BET, 
postrepair recovery of enzyme and coenzyme, 
and thymine back-flip and DNA release. The 
events of CPD repair are mediated by struc- 
tural fluctuations of the FAD coenzyme and 
the surrounding active-site moieties, including 
the 5WC/R256 locus, whereas thymine back- 
flip and the onset of DNA release proceed along 
the BIR region of the unpaired bubble. We also 
observed the directionality of DNA repair, that 
is, the 5'-thymine moves away from the 3'-thymine 
after the opening of the CPD ring so that the 
5'-thymine leaves the active site first. 


Movie 1. 3D molecular movie of photolyase-catalyzed DNA repair. Key structural components of all
intermediates are highlighted in Fig. 6 and Table 1. As also explained in Fig. 6, the evolving C5-C5' and C6-C6'
distances reflect shifts in the populations of the intermediates in the opening of the cyclobutane ring, not
the elongation of bonds.


Materials and methods summary
Sample preparation


MmCPDII-DNA cocrystals were grown accord-
ing to previously published conditions (16, 22)
after applying further optimizations for enzyme
activity and large-scale microcrystal produc-
tion. Here, the MmCPDII protein solution was
directly activated before crystallization by photo- 
reduction in the presence of dithiothreitol 
(DTT), white light, and anaerobic conditions, 
followed by mixing with CPD-containing DNA. 
After an incubation period of ~2 hours at 23°C, 
crystals were harvested, mixed with grease, and 
loaded into injectors according to previously 
established protocols (52). To preserve enzy- 
matic activity, all steps were performed under 
anaerobic conditions in a Coy Labs vinyl 
anaerobic chamber. Further, to prevent acciden- 
tal DNA repair, all experimental steps after 
enzyme photoreduction were performed under 
safety light (640 nm). To confirm the enzyme 
status, in-solution reoxidation was followed 
by UV-Vis absorption spectroscopy (fig. S17A), 
which showed that MmCPDII remained in the 
fully reduced state under experimental condi- 
tions for at least 24 hours after photoreduc- 
tion. Although the presence of photodamaged 
DNA is known to further stabilize FADH⁻ in
photolyases, we additionally confirmed the status of
MmCPDII in crystallo by UV-Vis absorption
spectroscopy of 20-hour-old crystals (fig. S17B). 
No crystals older than 20 hours were used for 
any of the experiments presented in this work. 
Further details about sample preparation are 
provided in the supplementary methods. 


Data collection 


At SACLA (53), data were collected at the BL2 
beamline using a 30-Hz pulse frequency and 
10-keV x-ray with a pulse duration of <10 fs and 
focused to a focal spot with a diameter of 1.5 µm.
For time-resolved experiments, a 15-Hz, 150-µJ,
408-nm optical parametric oscillator (OPO)
pump laser with a 3-ns pulse length and a
100-µm diameter focal spot was used. During
all experiments, the DAPHNIS (54) chamber 
was flooded with a 98% helium atmosphere to 
increase the signal-to-noise ratio and to pre- 
vent oxidation of the sample. Samples were 
extruded at 2.6 µl/min through a 75-µm nozzle
(55). Images were collected on a short-working 
distance octal multiport detector with a 
50-mm sample-to-detector distance (56). Ini- 
tial light and dark splitting as well as initial 
processing was performed by the Cheetah 
pipeline (57). 

At SwissFEL (58), the XFEL focus was also 1.5 µm
in diameter with a ~20-fs pulse duration but was
run at 100 Hz. For time-resolved data collection,
the in-house laser was set for a pulse duration
of 0.98 ps, with a pulse energy of 10 µJ and a
focal spot diameter of 50 µm running in a 3:1
setup. Accordingly, for every three light images,
one dark image was collected. Sample was ex-
truded at 5 µl/min through a 75-µm nozzle
while the chamber was maintained at 200 mbar 
after flooding and evacuating three times with 
helium. 
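
The interleaved collection schemes described above can be illustrated with a minimal Python sketch that labels X-ray frames as light or dark. The function name label_frames and the choice of which pulse in each block is left unpumped are assumptions made for illustration; the text specifies only the repetition rates and the 3:1 ratio. The 1:1 pattern for SACLA is inferred here from the 30-Hz X-ray and 15-Hz pump rates and is not stated explicitly.

```python
from collections import Counter


def label_frames(n_frames: int, light_per_dark: int) -> list[str]:
    """Label each X-ray frame 'light' or 'dark' for an N:1 interleaved scheme.

    Assumption: the dark frame is the last pulse in every block of
    (light_per_dark + 1) pulses; the actual ordering is not given in the text.
    """
    block = light_per_dark + 1
    return ["dark" if i % block == block - 1 else "light" for i in range(n_frames)]


# One second of SwissFEL data: 100-Hz X-rays in the 3:1 light:dark setup.
swissfel = Counter(label_frames(100, light_per_dark=3))

# One second of SACLA data: 30-Hz X-rays with a 15-Hz pump laser (inferred 1:1).
sacla = Counter(label_frames(30, light_per_dark=1))

print("SwissFEL:", dict(swissfel))
print("SACLA:", dict(sacla))
```

Under these assumptions, three quarters of the SwissFEL frames and half of the SACLA frames would be pumped, which is why the dark data sets accumulate at different rates at the two facilities.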

The average crystal size was 50 µm by 50 µm
by 50 µm, with a 1/e penetration depth of
103 µm at 400 nm. Further details and control
experiments for light contamination and laser 
power can be found in the supplementary 
methods, supplementary texts 1 to 3, and 
figs. S1 to S3. 
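
For orientation, the quoted 1/e penetration depth can be related to the pump intensity that reaches the back of an average crystal. The sketch below assumes simple exponential (Beer-Lambert) attenuation of intensity along one 50-µm crystal edge; the function name is hypothetical, and scattering, crystal orientation, and bleaching are ignored.

```python
import math


def transmitted_fraction(path_um: float, penetration_depth_um: float) -> float:
    """Fraction of incident intensity left after path_um, assuming I(z) = I0 * exp(-z / depth)."""
    return math.exp(-path_um / penetration_depth_um)


CRYSTAL_EDGE_UM = 50.0        # average crystal dimension quoted in the text
PENETRATION_DEPTH_UM = 103.0  # 1/e penetration depth at 400 nm quoted in the text

back_face = transmitted_fraction(CRYSTAL_EDGE_UM, PENETRATION_DEPTH_UM)
absorbed = 1.0 - back_face
print(f"fraction reaching the back face: {back_face:.2f}; absorbed: {absorbed:.2f}")
```

Under this assumption, roughly 60% of the incident pump intensity would still be present at the back face, so crystals of this size would be illuminated throughout their depth, although not uniformly.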


Processing, refinement, and analysis 


Data processing with CrystFEL (59, 60) and 
refinement procedures are described in detail 
in the supplementary methods. Briefly, we fol- 
lowed previously described guidelines (31-34) 
to perform structure factor extrapolation followed
by structural refinement (see tables S1 and S2
for refinement statistics). Additionally, two
features were given particular attention, namely 
the exact structure of the CPD cyclobutane ring 
(Fig. 1A) and the dihedral angles ρC and ρN of
the FAD isoalloxazine moiety (Fig. 1C). To ac- 
curately determine the crucial CPD and FAD 
geometries, we performed real-space correlation 
coefficient-based refinement of both features. 
Here, we searched for maximum correlation
between a ΔFo(Y-X) map and an equivalent
calculated difference map [ΔFc(Y-X)] based on
a fully refined static structure (X) and the time-
resolved structure to be refined (Y). This approach
produces a quantitative quality parameter, the
correlation coefficient, which can be compared for
a given structure versus all experimental data
to quantify how well small structural changes
can be monitored in our time courses (fig. S6).
The occupancies we achieved by light-triggering
(tables S1 and S2) are close to the experimental
quantum yield of 24 ± 1% for DNA repair by
MmCPDII (fig. S18 and table S3) for the ns-µs
series (20 to 22%) and about half of that for
the ps-ns series (11 to 13%).
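
The idea of scoring candidate structures by the correlation between observed and calculated difference maps can be sketched as follows. This is a minimal illustration, not the refinement procedure itself: it assumes both maps have already been sampled on the same grid and flattened to lists of voxel values, uses a plain Pearson correlation, and runs on synthetic random numbers. The names pearson_cc and best_model are hypothetical.

```python
import math
import random


def pearson_cc(observed, calculated):
    """Pearson correlation between two equally sized sequences of map values."""
    if len(observed) != len(calculated):
        raise ValueError("maps must be sampled on the same grid")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    return cov / math.sqrt(var_o * var_c)


def best_model(observed, candidates):
    """Return (name, cc) for the calculated map that correlates best with the observed map."""
    scores = {name: pearson_cc(observed, values) for name, values in candidates.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


# Synthetic demonstration only: the voxel values are random numbers, not real density.
rng = random.Random(0)
true_signal = [rng.gauss(0.0, 1.0) for _ in range(500)]
observed_map = [v + rng.gauss(0.0, 0.5) for v in true_signal]  # stands in for a noisy ΔFo map
candidates = {
    "model_a": true_signal,                                 # calculated map matching the signal
    "model_b": [rng.gauss(0.0, 1.0) for _ in range(500)],   # unrelated calculated map
}
name, cc = best_model(observed_map, candidates)
print(name, round(cc, 2))
```

In this toy case the matching model is selected because its calculated map tracks the noisy observed map, whereas the unrelated map correlates only by chance.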

The asymmetric unit contains two complexes. 
Unless otherwise stated, the maps shown in the 
figures are always from complex A, because 
those of complex B are much noisier than those
of complex A (I/σ values of 3 or below).


Consistency between F series and N series 


To control for consistency between the ps-ns
and ns-µs series, we collected the 10-ns time
point at both SwissFEL and SACLA. Two im-
portant factors need to be considered when
comparing these two 10-ns snapshots. The first
is resolution. Given the lower resolution of
N10ns (2.7 Å), atomic positions are less well
defined than those in the high-resolution
counterpart F10ns (2.15 Å). The second, and more
important, factor is pulse duration. For the F10ns
snapshot, each crystal is illuminated for 0.98 ps,
whereas for N10ns, illumination lasts 3 ns.
Accordingly, reaction coherence is higher in
F10ns than in N10ns. For example, the variance
for the ps-ns series is expected to be ±0.001 ns,
that is, the pump pulse duration. Meanwhile, in
the ns-µs series, the variance could amount up
to ±3 ns, that is, an error of 30% relative to
N10ns, which decreases for longer delay times.
Accordingly, the N10ns snapshot corresponds
more to a mixture of the F6ns and F10ns
snapshots, as implied by the finding that their
correlation coefficients around the CPD are
about the same (N10ns/F6ns: 0.62; N10ns/
F10ns: 0.61; F6ns/F10ns: 0.67).
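
The timing argument above is simple arithmetic, sketched here in Python. The function name is hypothetical, and the delays beyond 10 ns are arbitrary examples chosen to show how the relative error shrinks; only the pulse durations and the 10-ns delay come from the text.

```python
def relative_timing_uncertainty(pump_pulse_ns: float, delay_ns: float) -> float:
    """Pump pulse duration expressed as a fraction of the nominal pump-probe delay."""
    return pump_pulse_ns / delay_ns


# ns-µs series: 3-ns pump pulse, evaluated at 10 ns and at two longer example delays.
ns_us = {delay: relative_timing_uncertainty(3.0, delay) for delay in (10, 100, 1000)}

# ps-ns series: 0.98-ps (0.00098-ns) pump pulse at the same 10-ns delay.
ps_ns_10 = relative_timing_uncertainty(0.00098, 10)

for delay, frac in ns_us.items():
    print(f"ns-µs series, {delay} ns delay: {frac:.1%}")
print(f"ps-ns series, 10 ns delay: {ps_ns_10:.4%}")
```

At a 10-ns delay the 3-ns pulse corresponds to the 30% figure quoted above, whereas the 0.98-ps pulse contributes about one hundredth of a percent.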


Calculation of C-C peak distances 

Bias-free monitoring of distances between peaks 
from difference maps is more reliable than 
atom-atom distances from refined structural 
models in tracking structural changes by TR-SFX. 
In our study, only the 5’-thymine, and not the
3'-thymine, appears to move upon opening of
the cyclobutane, leading to mainly negative 
peaks for the cyclobutane and positive peaks at 
the 5’-thymine side. To determine the distance 
between the negative cyclobutane and the pos- 
itive 5'-thymine difference density peaks along 
the C5-C5’ and C6-C6’ bonds, ΔFo maps were
generated with the Phenix isomorphous dif-
ference tool as described in the supplementary
materials and as shown in Figs. 3A and 4. The
CCP4 tool fft was then used to extract electron
densities at a 2.5σ contour level. To ensure that
an identical number of voxels existed in each
map even at different resolutions, the grid was
set to 1/4 times the maximum resolution. Next,
a Python script was used first to position a
dummy atom on the C5 and C6 positions and
then to move them along the C5-C5’ and the
C6-C6' bond axes in steps of 0.385 Å (half the
atomic radius of carbon) for 11 steps (4.235 Å)
toward, and beyond, the 5’-thymine. In both 
cases, the dummy atom was then used as a mask 
to extract the integrated electron density at each 
position and plotted versus distance (fig. S5). The 
distance between the minimum (cyclobutane 
peak) and maximum (5’-thymine peak) along 
each axis was then extracted from the plots. 
Importantly, if either a minimum or a max- 
imum could not be determined, no peak distance 
could be calculated and therefore no value was 
assigned [noted as a dash (-) in Table 1] or 
plotted (Figs. 3B and 4B). 
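
The scan described above can be sketched in a few lines of Python. This is not the script used in the study: the coordinates, the function names, and the synthetic density function are hypothetical, the density is evaluated at a point rather than integrated inside a dummy-atom mask, and the rule for an undeterminable peak (no negative minimum or no positive maximum) is an assumed reading of the text. Only the step size and number of steps are taken from the description.

```python
import math

STEP = 0.385   # Å, step size quoted in the text
N_STEPS = 11   # number of steps quoted in the text (11 x 0.385 Å = 4.235 Å)


def scan_points(origin, toward, step=STEP, n_steps=N_STEPS):
    """Dummy-atom positions: the origin plus n_steps moves along the origin->toward axis."""
    axis = [t - o for o, t in zip(origin, toward)]
    length = math.sqrt(sum(a * a for a in axis))
    unit = [a / length for a in axis]
    return [(k * step, tuple(o + k * step * u for o, u in zip(origin, unit)))
            for k in range(n_steps + 1)]


def peak_distance(profile):
    """Distance between the most negative and most positive value of a density profile.

    profile: list of (distance_in_angstrom, density_value) pairs.
    Returns None when there is no negative minimum or no positive maximum.
    """
    d_min, rho_min = min(profile, key=lambda p: p[1])
    d_max, rho_max = max(profile, key=lambda p: p[1])
    if rho_min >= 0 or rho_max <= 0:
        return None
    return abs(d_max - d_min)


def toy_density(xyz):
    """Synthetic difference density: negative peak at x = 0 Å, positive peak at x = 2.31 Å."""
    x = xyz[0]
    return -math.exp(-(x / 0.5) ** 2) + math.exp(-((x - 2.31) / 0.5) ** 2)


c5 = (0.0, 0.0, 0.0)        # hypothetical C5 position
c5_prime = (1.6, 0.0, 0.0)  # hypothetical C5' position; it only defines the scan axis
profile = [(d, toy_density(xyz)) for d, xyz in scan_points(c5, c5_prime)]
print(peak_distance(profile))
```

Returning None rather than a number mirrors the dashes in Table 1: a distance is reported only when both the cyclobutane minimum and the 5'-thymine maximum are present in the profile.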
INTRODUCTION AND RATIONALE: The drastic
morphological differences between maize and 
its wild relatives gave rise to more than a cen- 
tury of debate about its origins. Today, the most 
widely accepted model is also the simplest— 
maize was domesticated once from the wild 
annual grass Zea mays ssp. parviglumis in 
the lowlands of southwest Mexico. More recent- 
ly, however, genomic surveys of traditional maize 
varieties in both Mexico and South America have 
identified evidence for gene flow from a second 
wild relative, Zea mays ssp. mexicana, a weedy 
annual grass adapted to the central Mexican 
highlands. These results, combined with long- 
standing archaeological evidence of hybridiza- 
tion, challenge the sufficiency of a simple model 
of a single origin. 


RESULTS: To elucidate the genetic contribu- 
tions of Zea mays ssp. mexicana to maize, we 
analyzed >1000 wild and domesticated genomes,
including 338 newly sequenced traditional vari-
eties. We found ubiquitous evidence for ad- 
mixture between maize and Zea mays ssp. 
mexicana, including in ancient samples from 
North and South America, diverse traditional 
varieties, and even modern inbred lines. These 
results are mirrored in a genotyping survey of 
>5000 traditional varieties representing maize 
diversity across the Americas. The only maize 
sample surveyed that lacks strong evidence for 
admixture with Zea mays ssp. mexicana is a 
single ancient South American sample N16, 
dating to ~5500 years before present. 

We next fit graphs of population history to our 
data, revealing multiple admixture events in the 
history of modern maize. On the basis of these 
results, we propose a new model of maize ori- 
gins, which posits that, some 4000 years after 
domestication, maize hybridized with Zea mays 
ssp. mexicana in the highlands of central Mexico. 
The resulting admixed maize then spread across
the Americas, replacing or hybridizing with
existing populations. The timing of this second
dispersal is roughly coincident with archaeolog-
ical data showing a transition to a staple maize
diet in regions across Mesoamerica.

Admixture analysis reveals widespread contributions of two teosintes to modern maize. (A) Proportion of
highland teosinte admixture for traditional maize varieties across the Americas. (B) Admixture graph representing
our model of maize evolution. (C) Cartoon depiction of proposed maize domestication and dispersal.
(D) Characterization of admixture tracts along maize genomes. (E) Admixture for cob weight reveals a
peak on chromosome 1.

We then explored variation in ancestry along 
the maize genome. We found that 15 to 25% of 
the genome could be attributed to Zea mays 
ssp. mexicana ancestry. We identified regions 
in which Zea mays ssp. mexicana alleles had 
reached high frequency in maize, presumably 
as a result of positive selection. We investigated 
one of these adaptive introgressions in more 
detail, using CRISPR-Cas9 knockout mutants 
and overexpression lines to demonstrate the 
role of the circadian clock gene ZmPRR37a in 
determining flowering time under long-day 
conditions. Our results suggest that introgres- 
sion at this locus may have facilitated the adap-
tation of maize to higher latitudes.

Finally, we explored the contributions of 
Zea mays ssp. mexicana alleles to phenotypic 
variation in maize. Admixture mapping iden- 
tified at least 25 loci in modern inbred lines 
where highland teosinte ancestry associates with 
phenotypes of agronomic importance, from oil 
content to kernel size and disease resistance,
as well as a large-effect locus associated with
cob diameter in traditional maize varieties. We
then modeled the additive genetic variance of
each phenotype, allowing us to estimate that
Zea mays ssp. mexicana admixture explained 
a meaningful proportion of the additive genetic 
variation for many traits, including 25% of the 
variation for the number of kernels per row 
and nearly 50% of some disease phenotypes. 


CONCLUSION: Our extensive population and 
quantitative genetic analysis of domesticated 
maize and its wild relatives uncovered a sub-
stantial role for two different wild taxa in making
modern maize. We propose a new model for the
origin of maize that can explain both genetic and 
archaeological data, and we show how variation 
in Zea mays ssp. mexicana is a key component of 
maize diversity, both at individual loci and for 
genetic variation underlying agronomic traits. 

Our model raises a number of questions about 
how and why a secondary spread of maize may 
have occurred, but we speculate that the timing 
of admixture suggests a possible direct role for 
hybridization between maize and Zea mays 
ssp. mexicana in improving early domesticated 
forms of maize, helping to transform it into the 
staple crop we know today. 


The list of author affiliations is available in the full article online. 
*Corresponding author. Email: rossibarra@ucdavis.edu (J.R.-I.); 
yjianbing@mail.hzau.edu.cn (J.Y.); ningy@mail.hzau.edu.cn (N.Y.)

The origins of maize were the topic of vigorous debate for nearly a century, but neither the current
genetic model nor earlier archaeological models account for the totality of available data, and recent 
work has highlighted the potential contribution of a wild relative, Zea mays ssp. mexicana. Our population 
genetic analysis reveals that the origin of modern maize can be traced to an admixture between 
ancient maize and Zea mays ssp. mexicana in the highlands of Mexico some 4000 years after 
domestication began. We show that variation in admixture is a key component of maize diversity, both 
at individual loci and for additive genetic variation underlying agronomic traits. Our results clarify the 
origin of modern maize and raise new questions about the anthropogenic mechanisms underlying
dispersal throughout the Americas.


The domestication of crops transformed
human culture. For many crops, the wild
plants that domesticates are most closely
related to can be readily identified by
morphological and genetic similarities.
But the origins of maize (Zea mays subsp. mays) 
have long been fraught with controversy, 
even with its global agricultural importance, 
ubiquity, and extended scrutiny as a genetic 
model organism. Although there was general 
agreement that maize was most morphologi- 
cally similar to North American grasses in the 
subtribe Tripsacinae (1, 2), none of these grasses 
bear reproductive structures similar to the maize 
ear, in which seeds are exposed along a com- 
pact, nonshattering rachis. The form is so radi- 
cally distinct from its relatives that the maize 
ear has been called “teratological” (3) and a 
“monstrosity” (4). 

Explanations for the ancestry of maize have 
long been contentious (5). A model popular for 
much of the 20th century, based on extensive 
evaluation of the morphology of archaeologi- 
cal samples, argued that modern maize was 
the result of hybridization between a now- 
extinct wild maize and another wild grass 
(6). This archaeological model, however, fails 
to explain cytological (7) or genetic (8, 9) data 
showing that maize is most closely related to
the extant wild grass Zea mays ssp. parviglumis
(hereafter parviglumis). Today, the most widely 
accepted model is also the simplest—maize was 
domesticated from a wild annual grass in the 
genus Zea, commonly known as teosinte. This 
idea, originating with Ascherson (10) and cham-
pioned by George Beadle throughout the 20th
century (4, 7), became firmly cemented in the 
literature after genetic analysis revealed clear 
similarities between maize and teosinte (8, 9, 11).
Nonetheless, this simple genetic model is in- 
sufficient to explain disparities between gen- 
etic and geographic overlap between maize 
and parviglumis (12) or morphological sup- 
port for admixture in archaeological samples 
(13-15). 

Much of the early work on maize origins was 
complicated by the relatively poor characteri-
zation of the diversity of annual teosinte (16).
In addition to the lowland parviglumis, the other 
widespread annual teosinte is Z. mays ssp. 
mexicana (hereafter mexicana), found through- 
out the highlands of Mexico. These taxa diverged 
30,000 to 60,000 years ago (17, 18) and show 
clear morphological (19), ecogeographic (20, 21),
and genetic (22, 23) differences, as well as local 
adaptation along elevation (24). In contrast to 
the overall genetic similarities between maize 
and parviglumis, some early genetic studies
identified sharing of alleles between mexicana
and highland maize (25), a result confirmed by 
extensive genome-wide data (26, 27). Maize and 
mexicana co-occur in the highlands of Mexico, 
but recent work has revealed mexicana ances- 
try far outside this range, including in ancient 
maize from New Mexico (28), modern samples 
in the Peruvian Andes (29), and individual al- 
leles apparently selected broadly in modern 
maize (30, 31). 


Admixture with mexicana is ubiquitous in 
modern maize 


Archaeological data suggest that after its ini- 
tial domestication in the lowlands of the Balsas 
River basin, maize was introduced to the high- 
lands of central Mexico by ~6200 cal BP (calen- 
dar years before present) (32), where it first 
came into sympatry with mexicana. By this 
time, however, maize had already reached 
Panama (by ~7800 cal BP) (33) and even farther 
into South America (by ~6900 to 6700 cal BP) 
(34-36). Samples from South America that 
reflect dispersal events before maize coloniza- 
tion of the Mexican highlands should thus not 
exhibit evidence of admixture with mexicana. 
Indeed, tests of admixture find no evidence 
of mexicana ancestry in N16, a ~5500 cal BP 
maize cob from northern Peru (37). 

To investigate evidence of mexicana admix- 
ture across a broad sampling of maize, we 
applied f4 tests (38) using a sample of the diploid
perennial teosinte Zea diploperennis (39) as the 
outgroup. The greatest diversity of maize is 
found in present-day Mexico, but whole-genome 
resequencing exists for only a handful of tra- 
ditional Mexican maize (40). We therefore 
sequenced 267 accessions of open-pollinated 
traditional maize from across Mexico (Fig. 1A 
and data S1 and S2). Applying f4 tests revealed
significant admixture with mexicana in all
maize except the ancient Peruvian sample 
N16 (Fig. 1B and data S3). Analysis of subsets 
of the data allowing use of a greater number of 
single-nucleotide polymorphisms (SNPs) reveals 
higher f4 values for N16, but these are still
substantially lower than for all other samples 
(data S4). We find evidence for mexicana ad- 
mixture well outside of Mexico, including in 
modern samples from the US Southwest and 
the Andes, consistent with previous work (29), 
as well as a newly sequenced set of 73 tradi- 
tional Chinese varieties that represent maize 
dispersal out of the Americas after European 
colonization (data S1 and S2). We extended
our search to ancient samples, again finding
mexicana admixture in archaeological samples
from the Tehuacan Valley in central Mexico
dating to ~5300 cal BP (41) and both lowland
and highland (>2000 m above sea level) samples
from South America dating to ~1000 cal BP (42).
Finally, we turned to modern breeding material,
where, again, f4 tests identify significant
admixture in a diversity panel of >500 modern
inbred lines (43). In sum, we find evidence of
mexicana ancestry in all examined maize samples
dating as early as ~5300 cal BP.

Fig. 1. Admixture from Zea mays ssp. mexicana is ubiquitous in maize. (A) Sampling of newly sequenced, published, and ancient maize genomes. See data S1
for details on sampling. (B) f4 statistics for different groups of maize. (C) Proportion of mexicana admixture estimated for ~5000 field collections from the International
Maize and Wheat Improvement Center (CIMMYT). (D) Correlation (R² = 0.97) between the first principal component (PC1) of genetic diversity in 5373 CIMMYT traditional
maize varieties and mexicana admixture.
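
As a rough illustration of the statistic behind these tests, the sketch below computes f4 from per-population allele frequencies. The argument order f4(outgroup, mexicana; parviglumis, test), the function name, and the toy frequencies are assumptions made for illustration only; the study itself ran its tests in ADMIXTOOLS 2, and the exact configuration it used is not restated here.

```python
from statistics import mean


def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d).

    Each argument is a list of allele frequencies, one entry per SNP,
    with all four lists in the same SNP order.
    """
    return mean((a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d))


# Hypothetical allele frequencies at four SNPs (not real data).
outgroup = [0.1, 0.5, 0.9, 0.3]
mexicana = [0.3, 0.2, 0.8, 0.6]
parviglumis = [0.2, 0.6, 0.7, 0.1]

# A test population that is pure parviglumis, and one with 20% mexicana ancestry.
unadmixed = list(parviglumis)
admixed = [0.8 * p + 0.2 * m for p, m in zip(parviglumis, mexicana)]

# With no admixture the statistic is zero; with mexicana ancestry it is positive
# in this toy example, because the test population shares drift with mexicana.
print(f4(outgroup, mexicana, parviglumis, unadmixed))
print(f4(outgroup, mexicana, parviglumis, admixed))
```

In practice the significance of a nonzero f4 is judged against a standard error from a block jackknife over the genome, which this sketch omits.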

To investigate the importance of introgres- 
sion to maize diversity more broadly, we ran 
STRUCTURE (44) to estimate mexicana ances- 
try in genotyping data from a much larger sam- 
ple of 5373 traditional maize varieties and 310 
wild samples of mexicana and parviglumis 
from across the Americas (45, 46). These maize 
samples also show ubiquitous evidence of
mexicana admixture (Fig. 1C). More sur- 
prisingly, principal component analyses of 
these maize samples reveal that the major axis 
of genetic variation across maize in the Americas 
is nearly perfectly correlated with the proportion 
of the genome showing mexicana ancestry 
[coefficient of determination (R²) = 0.97; Fig. 1D].
The subspecies mexicana and parviglumis
diverged tens of thousands of years before
domestication (18), and it thus seems plausible
that differences in the proportion of these 
ancestries dominate more recently derived 
aspects of maize diversity. 
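
The coefficient of determination quoted above can be illustrated with a minimal sketch. For a simple linear relationship between two variables, R² equals the squared Pearson correlation, which is what the function below computes. The function name and the toy values are hypothetical; they stand in for per-variety PC1 scores and estimated mexicana ancestry proportions.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical PC1 scores and mexicana ancestry proportions for five varieties.
pc1 = [-2.1, -0.9, 0.2, 1.1, 1.9]
ancestry = [0.05, 0.12, 0.21, 0.27, 0.33]
print(r_squared(pc1, ancestry))
```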


A novel model of maize origins 


We interpret the timing and universality of
mexicana admixture as evidence supporting a
novel model of maize origins (Fig. 2A).
Consistent with previous work (42, 47), we
propose that maize dispersed out of the Balsas
River basin in Mexico after domestication from
parviglumis, quickly reaching South America
by at least ~6500 cal BP (48). Then, ~6000 cal
BP, maize was adopted by peoples living in the
highlands of central Mexico, resulting in
admixture with the sympatric mexicana
(12, 26, 27, 32). Ancient samples showing
mexicana admixture suggest that maize spread
rapidly from the highlands of Mexico, replacing
or mixing with existing populations across the
Americas, introducing mexicana alleles as it
moved. As it entered into the lowlands of
Mexico, maize must have once again come into
contact with parviglumis. This model is
consistent with the second wave of maize
migration into South America posited by
Kistler et al. (49) and supported by recent
chloroplast data (50) but further explains the
origin of that wave and the existence of
mexicana alleles in South America, far outside
mexicana's native range, by at least
~1000 cal BP.

Fig. 2. A novel model of maize origins. (A) Admixture graph of lowland Mexican and highland Andean
maize showing three hypothesized admixture events (dotted lines): (i) between mexicana and an ancient
North American lineage of maize sister to N16, (ii) between admixed maize and parviglumis as maize
moved back out of the highlands, and (iii) between admixed maize and an ancient South American lineage
represented by N16 as admixed maize moved into South America. Estimated edge lengths and admixture
proportions (with confidence intervals in parentheses) are shown. (B) Proposed model of maize origin
showing two waves of movement out of Mexico: early movement after initial domestication in the Balsas
(top, black) and a second wave out of the highlands of Mexico after admixture with mexicana (bottom, red).

To formally evaluate our model, we fit
admixture graphs to f statistics from five
traditional maize varieties from the Andes (40)
and 118 of our newly sequenced traditional maize
varieties from Mexico that were collected at
low elevation (<1500 m above sea level) (51).
Our fitted admixture graph (Fig. 2A) estimates 
that initial hybridization with mexicana was 
substantial, consistent with estimates from 
modern-day traditional maize varieties from the 
highlands of central Mexico (27). Subsequent 
admixture with parviglumis reduced the con- 
tribution of mexicana to maize ancestry out of 
the highlands, but our model estimates that 
mexicana ancestry still represents ~25% of the 
genome of extant traditional varieties in Mexico. 
A simplified version of the model estimates 
nearly identical ancestry proportions for modern 
maize inbred lines (fig. S1). Although these
estimates are lower than those from reduced-
representation genotyping (Fig. 1C), genotyping
SNPs overestimate genome-wide admixture
proportions owing to their biased distribution
across the genome (51) (fig. S2). A thorough evaluation
of alternative graphs (52) found models with 
nominally better fits to the data (53), but none 
had significantly better out-of-sample predic- 
tive ability (data S5; lowest empirical bootstrap 
P value of 0.38), and we were unable to identify 
a better-fitting graph with fewer admixture 
events. Although many of the alternative graphs 
include implausible phylogenetic histories, and 
some incorporate hybrid parviglumis-mexicana 
populations (54), all of the alternative graphs
qualitatively support our proposed model in 
requiring postdomestication admixture with 
mexicana. Finally, we note that our model is 
consistent with an independent population 
genetic approach (55) that estimated the timing 
of mexicana admixture at 5716 years (±5614)
(51), which is exceptionally close to the earliest 
archaeological evidence of maize in the high- 
lands (32) and substantially later than the first 
evidence of domesticated maize (56). 
Together, the confluence of archaeological 
and genetic data suggests that mexicana ad- 
mixture was central to the widespread use and 
dispersal of maize in the Americas by ~4000 cal 
BP (Fig. 2B). The timing of admixture between 
maize and mexicana in the highlands of Mexico 
between 6000 and 4000 cal BP corresponds 
with observed increases in cob size and the 
number of seed rows in archaeological samples 
(57-59). Southward dispersal of maize varieties 
with mexicana admixture coincides with the 
appearance of improved maize varieties in 
Honduras (49) but contrasts with a coeval move- 
ment of peoples northward (60). Archaeological 
samples demonstrate the presence of maize 
as a staple grain in the neotropical lowlands 
of Central America subsequent to mexicana 
admixture, between 4700 and 4000 cal BP 
(59, 61, 62). Ultimately, all varieties of maize 
in Mesoamerica had mexicana admixture by 
~3000 cal BP as it became a staple grain across 
the entire region (28, 58, 59, 63). Early Meso- 
american sedentary agricultural villages also 
began developing at this time, forming the basis 
for demographic expansion and the emergence
of later state-level societies dependent on more- 
intensive forms of maize agriculture (64-67). 


Variation in admixture along the genome 


Having established a central role for both 
parviglumis and mexicana in the origins of 
modern maize diversity, we next explored varia- 
tion in mexicana ancestry across the genome. 
Using unadmixed parviglumis and mexicana 
individuals (39) as references, we applied an 
ancestry hidden Markov model to identify 
regions of mexicana ancestry along individual 
maize genomes (51). In close agreement with
our admixture graph, we estimate 15 to 25% 
average mexicana ancestry across 845 maize 
genomes (mean: 18%; data S6) (68). This varia- 
tion in total ancestry among modern maize is 
much greater than that predicted from a single 
pulse of ancient admixture (51) and likely reflects
a combination of selection as well as ongoing 
gene flow in parts of the range (27). Mexicana 
ancestry also varies considerably along the 
genome (Fig. 3, A and B). The vast majority of 
introgressed haplotypes are small—on the scale 
of 10 kb (fig. S3)—consistent with a relatively 
ancient origin. In addition to numerous small 
introgressed haplotypes, we identify signals 
consistent with an important role for inversion 
polymorphisms. These include the apparent
presence of the large inversion Inv4m, a
well-studied target of adaptive introgression
in maize from highland environments (26, 27),
in two Chinese inbred lines and one traditional
Mexican variety (Fig. 3A). We also see high levels
of mexicana admixture in the region of Inv1n, a
50-Mb inversion common in parviglumis but
rare in mexicana and entirely absent in maize
(69) (fig. S4). Finally, we estimate drastically
decreased levels of mexicana introgression for 
chromosomes 5, 8, and 9 (fig. S5), which we 
speculate may be attributable to the presence 
of a recently characterized genetic incompati- 
bility on chromosome 5 (70) and multiple large 
mexicana-specific inversions on chromosomes 
8 and 9 that could hinder introgression by
repressing recombination (18, 22).
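
To make the idea of an ancestry hidden Markov model concrete, the following is a minimal two-state sketch using the forward-backward algorithm on a single haplotype. It is deliberately much simpler than ELAI, the tool named in the methods summary: the single-layer structure, the symmetric switch probability, the prior, the function name, and the toy data are all assumptions for illustration.

```python
def posterior_mexicana(alleles, freq_parv, freq_mex, prior_mex=0.2, switch=0.05):
    """Posterior probability of mexicana ancestry at each SNP of one haplotype.

    alleles   : observed alleles (0 or 1) along the haplotype
    freq_parv : frequency of allele 1 in the parviglumis reference, per SNP
    freq_mex  : frequency of allele 1 in the mexicana reference, per SNP
    Reference frequencies should lie strictly between 0 and 1.
    """
    n = len(alleles)
    init = (1.0 - prior_mex, prior_mex)
    # State 0 = parviglumis ancestry, state 1 = mexicana ancestry.
    trans = ((1.0 - switch, switch), (switch, 1.0 - switch))

    def emit(i):
        if alleles[i] == 1:
            return (freq_parv[i], freq_mex[i])
        return (1.0 - freq_parv[i], 1.0 - freq_mex[i])

    # Forward pass, rescaled at every step to avoid numerical underflow.
    fwd = []
    for i in range(n):
        e = emit(i)
        if i == 0:
            cur = [init[s] * e[s] for s in (0, 1)]
        else:
            prev = fwd[-1]
            cur = [e[s] * sum(prev[r] * trans[r][s] for r in (0, 1)) for s in (0, 1)]
        total = sum(cur)
        fwd.append([c / total for c in cur])

    # Backward pass with the same rescaling.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        e = emit(i + 1)
        cur = [sum(trans[s][r] * e[r] * bwd[i + 1][r] for r in (0, 1)) for s in (0, 1)]
        total = sum(cur)
        bwd[i] = [c / total for c in cur]

    # Combine both passes and normalize per site.
    post = []
    for f, b in zip(fwd, bwd):
        w_parv, w_mex = f[0] * b[0], f[1] * b[1]
        post.append(w_mex / (w_parv + w_mex))
    return post


# Toy haplotype: a mexicana-like tract of five SNPs flanked by parviglumis-like SNPs.
toy_alleles = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
toy_parv = [0.1] * 11
toy_mex = [0.9] * 11
print([round(p, 3) for p in posterior_mexicana(toy_alleles, toy_parv, toy_mex)])
```

Short tracts, as reported above for most introgressed haplotypes, correspond to frequent ancestry switches along the genome, which a model of this kind captures through its transition probabilities.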

A detailed look at admixture along individual 
genomes enabled us to begin to investigate the 
functional significance of variation in mexicana 
admixture. First, consistent with the possibility 
that mexicana alleles may have served to com- 
plement recessive deleterious genetic variants 
that rose to appreciable frequency in early do- 
mesticated maize (40), we find significantly 
lower genetic load on introgressed mexicana 
haplotypes (51) (fig. S6). We then turned to
individual loci, identifying regions of the
genome in which high-confidence mexicana alleles
(>90% posterior probability) were at high
frequency (>80%) across all modern maize,
consistent with recent positive selection (51)
(Fig. 3, A and B, and fig. S7). We found these
loci clustered into 11 regions, which overlap
quantitative trait loci for agronomically
relevant phenotypes (71) and include genes with
well-studied functions in Arabidopsis such as
disease resistance and floral morphology (data S7).

Fig. 3. Variation and functional validation of mexicana admixture. (A) (Top)
Number of high-confidence mexicana alleles (>90% posterior probability) that
exist in >80% of lines of all modern maize along chromosome 4 (black points) and
average mexicana ancestry (red). (Bottom) Mexicana ancestry of three inbred
lines in the region around chromosome inversion Inv4m. (B) (Top) Number of
high-confidence mexicana alleles (>90% posterior probability) that exist in
>80% of lines of all modern maize along chromosome 7 (black points) and average
mexicana ancestry (red). (Bottom) Mexicana ancestry in B73 across the
ZmPRR37a gene model (black bar). The differences of days to anthesis for
nontransgenic (NT) and overexpression (OE) lines of ZmPRR37a in (C) long-day
(LD) conditions (2022, China, 124°49'E, 43°30'N) and (D) short-day (SD)
conditions (2021, China, 108°43'E, 18°34'N). The data in (C) and (D) are
means ± SE. The numbers in each column indicate the sample sizes. The level of
significance was determined by a two-tailed Student's t test. (E) Nontransgenic
and two independent overexpression lines of ZmPRR37a grown in long-day
conditions. Scale bar, 10 cm.
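
The two thresholds used to define high-frequency mexicana alleles can be sketched as a simple filter. The sketch assumes that each line has a per-site mexicana ancestry dosage on a 0 to 2 scale, so that a dosage above 1.8 corresponds to the 90% confidence threshold; this reading of the ELAI score, the function name, and the toy matrix are assumptions and not details confirmed by the text.

```python
def high_frequency_sites(dosage, min_dosage=1.8, min_fraction=0.8):
    """Return indices of sites where most lines carry confident mexicana ancestry.

    dosage[i][j] is the (assumed) mexicana ancestry dosage, from 0 to 2,
    of line i at site j. A site is reported when strictly more than
    min_fraction of lines have a dosage strictly above min_dosage.
    """
    n_lines = len(dosage)
    n_sites = len(dosage[0])
    hits = []
    for j in range(n_sites):
        carriers = sum(1 for row in dosage if row[j] > min_dosage)
        if carriers / n_lines > min_fraction:
            hits.append(j)
    return hits


# Hypothetical dosages for five lines at three sites.
toy = [
    [2.00, 2.00, 0.0],
    [1.90, 1.90, 0.1],
    [2.00, 1.95, 0.0],
    [1.85, 2.00, 0.2],
    [1.90, 1.00, 0.0],
]
print(high_frequency_sites(toy))
```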

We focused on one region on chromosome 
7, where we found a narrow peak of high- 
frequency mexicana alleles that overlaps with 
maize-teosinte flowering time quantitative 
trait loci (71) and is centered on the gene
Zm00001d022590, also known as ZmPRR37a 
(Fig. 3B). Alleles from mexicana at ZmPRR37a 
SNPs are found in up to 89% of all maize, in- 
cluding the reference genome line B73 (fig. S8). 
ZmPRR37a is thought to be involved in the 
circadian clock-controlled flowering pathway 
(72) and is an ortholog of the sorghum gene 
Ma1, which controls flowering under long-day
conditions (73). To validate this function, we
obtained a CRISPR-Cas9 knockout mutant
from a targeted mutagenesis library (74) and
developed two transgenic overexpression lines
(51). Consistent with its hypothesized role in
response to day length, ZmPRR37a knockout
mutants exhibited a significantly earlier
flowering phenotype in long-day conditions
(two-tailed Student's t test, P values are
indicated in fig. S9, A, B, and D) but showed no
effect in short-day conditions (two-tailed
Student's t test, P values are indicated in
fig. S9, A and C), and overexpression lines
exhibited significantly later flowering in both
long- and short-day conditions (two-tailed
Student's t test, P values are indicated in
Fig. 3, C to E). Maize carrying the
mexicana introgression at ZmPRR37a shows
lower levels of expression than parviglumis
(75), and our functional evaluation thus
suggests that mexicana alleles at ZmPRR37a may
have helped maize adapt to earlier flowering
in long-day conditions as it expanded out of
Mexico to higher latitudes.
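
For readers unfamiliar with the test named above, here is a minimal standard-library sketch of a two-tailed Student's t test with pooled variance. The days-to-anthesis values are invented for illustration and are not the study's measurements; the function names and the use of Simpson's rule for the P value are choices made for this sketch.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value: 1 minus twice the area of the density on [0, |t|].

    The area is computed with Simpson's rule (steps must be even).
    """
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    s = t_pdf(0.0, df) + t_pdf(b, df)
    for k in range(1, steps):
        s += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return max(0.0, 1.0 - 2.0 * s * h / 3.0)


def student_t_test(a, b):
    """Equal-variance two-sample t test. Returns (t statistic, two-tailed P)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, two_tailed_p(t, df)


# Hypothetical days to anthesis for nontransgenic and overexpression plants.
nontransgenic = [70, 71, 69, 72, 70]
overexpression = [75, 77, 76, 74, 78]
t_stat, p_value = student_t_test(nontransgenic, overexpression)
print(t_stat, p_value)
```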


Admixture with mexicana underlies phenotypic 
variation in maize 


Admixture with teosinte has been associated 
with phenotypic variation for a number of traits 
in traditional maize (76), and mexicana gene 
flow has been instrumental in the phenotypic 
adaptation of maize to the highlands (26, 77-79). 
Our analysis of teosinte ancestry across named 
varieties replicates historical estimates based 
on morphology (fig. S10 and data S8), suggesting 
a broader role for mexicana ancestry in pattern- 
ing phenotypic variation in maize. Indeed, if 
mexicana admixture played a key role in the 
dispersal and use of maize, mexicana alleles 
should contribute to agronomically relevant 
phenotypic variation. We thus combined our 
estimates of admixture with data from 33
phenotypes to perform multivariate admixture
mapping across 452 maize inbreds (51). At a
false discovery rate of 10%, we identified 92 
associations, which we grouped into 22 peaks 
representing 25 candidate genes (Fig. 4, fig. S11, 
and data S9 and S10). These include a
significant association with zeaxanthin, a
carotenoid pigment that plays a role in light
sensing and chloroplast movement (80) and is of
significance to human health (81), approximately
1 kb downstream of the gene ZmZEP1, a key
locus in the xanthophyll cycle that regulates
zeaxanthin abundance in low-light conditions
(fig. S12A). Haplotype visualization reveals clear 
sharing between maize and mexicana (fig. S12B), 
and the mexicana-like haplotype increases the 
expression of ZmZEP1 and reduces zeaxanthin
content in maize kernels (fig. S12, C and D). 
We also see associations with well-known lipid 
metabolism genes such as dgat1 and fae2 (82).
The mexicana allele at dgat1 is associated with
a decrease in the proportion of linoleic acid but
an increase in overall oil content; however,
variation in mexicana ancestry is not in linkage
disequilibrium with the well-studied amino acid
variant at this locus (83) (fig. S13). Although
expression of dgat has been suggested to play a
role in cold tolerance in maize and Arabidopsis
(84, 85), a preliminary experiment in maize
seedlings failed to identify differences in cold
tolerance in lines of varying ancestry at dgat1
(fig. S14). Finally,
in addition to identifying compelling candi- 
date loci in modern inbreds, we applied a novel 
genotype-by-environment association map- 
ping approach (86) in a large set of traditional 
maize varieties evaluated across 13 different 
common garden trials (87). We find a strong 
association on chromosome 1 (Fig. 4C), where 
mexicana ancestry increases cob size. The 
candidate gene closest to the associated SNP, 
Zm00001d029675, was recently identified as a 
target of selection during breeding efforts in 
both the United States and China (88). 
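
The text reports associations at a false discovery rate of 10% but does not say which procedure was used. Purely as an illustration, the sketch below applies the Benjamini-Hochberg step-up procedure, a common choice; the function name and the example P values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return the sorted indices of tests declared significant at the given FDR.

    Benjamini-Hochberg step-up rule: sort the P values, find the largest
    rank k with p_(k) <= fdr * k / m, and reject the k smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= fdr * rank / m:
            cutoff_rank = rank
    return sorted(order[:cutoff_rank])


# Hypothetical P values from five association tests.
print(benjamini_hochberg([0.9, 0.04, 0.001, 0.5, 0.02]))
```
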
While genome-wide association studies
(GWASs) can identify individual loci with large
effects, it is likely that mexicana admixture
contributes important variation of smaller
effect size to polygenic traits. To test this
hypothesis, we used our inbred association panel
and phenotype data to estimate the proportion
of additive genetic variance contributed by
mexicana (51) (data S11). We estimate that
mexicana admixture explains a meaningful
proportion of the additive genetic variation for
many of these traits, including nearly 25% for
the number of kernels per row, 15% for plant
height, 10% for flowering time, and 15 to nearly
50% for multiple disease phenotypes (Fig. 4D).

Fig. 4. Phenotypic impacts of mexicana admixture. (A) Effect sizes (scaled
by trait standard deviation) across traits for the 22 lead SNPs from admixture
GWAS in the inbred diversity panel. Lead SNPs are the lowest P value SNP
within 500-kb windows around significant associations. Gray boxes represent
missing data owing to low minor allele frequency. Black outlines show the trait
with the largest absolute value effect size for each SNP. Numbers above each
group of columns represent chromosomes, while numbers below represent
megabase positions. Trait name acronyms and descriptions are in data S9,
and trait acronym colors represent categories shown in (D). (B) Manhattan
plot of admixture GWAS for linoleic acid content in the inbred diversity panel.
The peak includes the gene dgat1. (C) Manhattan plot of admixture GWAS for
cob weight using traditional maize varieties. Red points in (B) and (C) represent
lead SNPs. (D) Variance partitioning in the inbred diversity panel. Shown
is the proportion of additive genetic variance (Va) explained by mexicana
admixture, with each point representing the estimate for a single


Discussion


Conflicting archaeological, cytological, genetic,
and geographic evidence led to two irreconcilable
models for the origin of maize. In this study, with
more than 1000 genomes of maize and teosinte,
including 338 newly sequenced traditional
varieties, we revisited the evidence for admixture
between maize and its wild relative Zea mays ssp.
mexicana. We propose a new model of maize
origins, which posits that, after admixture with
mexicana in the highlands of central Mexico,
admixed maize spread across the Americas,
either replacing or hybridizing with preexisting
maize populations. While this model is
consistent with both genetic and archaeological
data, it also raises a number of questions. Among
these, most notable is perhaps the question of 
why and how this secondary spread occurred— 
was it due to some advantage of the admixed 
maize over earlier domesticated forms, or was 
the spread coincidental with demic or cultural 
exchange among human populations (67)? 
Changes in maize cob morphology and die- 
tary isotope data from human populations in 
Central America indicate a transition between 
early cultivation and the use of maize as a staple 
grain between 4700 and 4000 cal BP (59). This 
timing suggests a possible direct role for hy- 
bridization between maize and mexicana in 
improving early domesticated forms of maize. 
To better understand why admixed maize may 
have been beneficial for early farmers, we sought 
to investigate associations between mexicana 
alleles and phenotypes in extant maize. We 
identified and functionally validated a locus 
important for photoperiodicity and flowering 
time and found candidate genes associated 
with important agronomic phenotypes, includ- 
ing nutritional content and the size of kernels 
and cobs. None of these loci individually, how- 
ever, are likely sufficient to drive a large advantage 
of admixed maize. And although we show that, 
combined, alleles introgressed from mexicana 
explain a meaningful proportion of additive 
genetic variance for agronomic and disease 
resistance traits, it remains unclear whether 
this novel variation could drive rapid adoption 
of admixed maize. In addition to variation at 
these specific phenotypes, admixture may have 
played a role in the spread of maize by aug- 
menting genetic diversity and ameliorating 
genetic load in early domesticated populations, 
perhaps even providing some generalized hybrid 
vigor. Indeed, we show that mexicana alleles carry
less load than maize alleles (fig. S6), and maize- 
mexicana hybrids show extensive heterosis for 
both viability and fecundity. This process could 
be augmented by similar ecologies as well—the 
global ecological niche of cultivated maize more 
closely reflects that of mexicana than parviglumis 
(69), and, like maize, mexicana has successfully 
colonized novel habitats at higher latitudes (89). 
Modern ethnographic evidence is also consis- 
tent with these ideas, as farmers continue to 
introgress teosinte into their maize populations 
to make them “stronger” (16, 90, 91).
Introgression between relatives has long been 
recognized as a major source of plant adaptation 
(92), yet only with the advent of molecular 
markers have we begun to recognize the key 
role that gene flow from wild relatives has 
played in crop evolution (93). Here, with exten- 
sive sampling and genomic coverage of both 
traditional and modern varieties as well as wild 
relatives and ancient samples, we argue that 
introgression from a close wild relative of maize 
was pivotal to its success as a staple crop. The 
presence of adaptive variation in wild relatives 
is not specific to maize, and we predict that a 
similar history will be revealed for many other
crops. Indeed, preliminary results already sug- 
gest a key role for hybridization in the evolution 
of rice, tomato, barley, and others (94-96). 
These results not only highlight the past im- 
portance of crop wild relatives but also point 
to their potential as a source of adaptive 
diversity for future breeding. Most importantly, 
the work presented here suggests that, for many 
crops, millennia of diligent efforts by early 
farmers have capitalized on this diversity and 
that an abundance of relevant functional di- 
versity may already be segregating in tradition- 
al varieties or preserved ex situ in germplasm 
gene banks. 


Materials and methods summary 


SNP data from 507 modern maize inbred lines, 
90 Z. mays ssp. mexicana, 75 Z. mays ssp.
parviglumis, and two Z. diploperennis were 
obtained from version 1 of the ZEAMAP project 
(39). We also sequenced an additional 338 tra- 
ditional maize varieties, including 267 from 
across Mexico and 71 from China (data S1 
and S2) and collected DNA sequencing data of 
30 published traditional varieties and 10 ancient 
maize samples (data S1). For these additional 
genomes, we called sites from the enlarged 
ZEAMAP of these lines. For ancient maize, we 
did complete quality control on the raw reads 
by cutting low-quality bases and removing the 
adaptors using fastp (97). Then we adopted an 
ancient DNA mapping method optimized for 
reducing reference sequence bias and improving 
the accuracy and sensitivity of ancient DNA 
sequence identification (98). We used
mapDamage2 (99) to estimate damage parameters
from the BAM files, and then we rescaled base
quality scores according to the probability
that a base derives from deamination (100).
We performed pseudohaploid calling with
given ZEAMAP sites using ANGSD (101). SNPs
supported by <2 reads and reads with mean
Phred score of <20 and mapping quality of <20
were filtered. The A alleles located in the 3'
end (<30% of the supporting reads) and the
T alleles located in the 5' end (<30% of the
supporting reads) were hard masked. The f4
test was carried out by ADMIXTOOLS 2 (52)
with Z. diploperennis as the outgroup and 
our unadmixed mexicana and parviglumis as 
the two contributors to the test population. 
Admixture graphs were estimated and
compared using ADMIXTOOLS 2 (52). The timing
of admixture between mexicana and maize 
was estimated by DATES (55). Admixture with 
mexicana in CIMMYT SeeD GBS samples was 
estimated by STRUCTURE (44). The genome- 
wide patterns of introgression of all 845 maize 
were investigated by ELAI (102). We defined 
high-frequency mexicana alleles as those for 
which >80% of the 845 maize lines had ELAI 
scores > 1.8. The functions of ZmPRR37a were 
investigated by transgenic overexpression or 
CRISPR-Cas9 gene editing. The constructed
overexpression and gene-editing vectors were 
transformed into maize inbred line KN5585. 
Genome-wide association of mexicana ancestry 
was performed using JointGWAS (86) for 33 
phenotypes in an inbred association panel (103) 
and a multisite set of phenotypic trials of tra- 
ditional varieties (87). Variance partitioning of 
phenotypes of 507 maize lines was performed 
by LDAK (104) using the kinship calculated by 
OSCA (105) from ELAI scores. All details of the 
materials and methods, including those sum- 
marized above, are provided in the supple- 
mentary materials. 
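
As a rough sketch of what pseudohaploid calling involves for low-coverage ancient samples, the function below filters the reads covering one site by the quality thresholds stated above and then samples a single read at random. This is a simplified illustration and not the ANGSD implementation: representing each read as an (allele, base quality, mapping quality) tuple, using one quality value per read, and the function name are all assumptions.

```python
import random


def pseudohaploid_call(reads, rng, min_reads=2, min_baseq=20, min_mapq=20):
    """Pick one allele at random from the reads that pass quality filters.

    reads : list of (allele, base_quality, mapping_quality) tuples at one site
    rng   : a random.Random instance, passed in so that calls are reproducible
    Returns None when fewer than min_reads reads pass the filters.
    """
    usable = [
        allele
        for allele, base_q, map_q in reads
        if base_q >= min_baseq and map_q >= min_mapq
    ]
    if len(usable) < min_reads:
        return None
    return rng.choice(usable)


# Hypothetical reads at one site: two pass the filters, one has low base quality.
rng = random.Random(1)
site_reads = [("A", 32, 40), ("G", 12, 40), ("A", 28, 37)]
print(pseudohaploid_call(site_reads, rng))
```
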
in meiosis requires the chromosomes in each
Recombination is initiated by the induction of
hundreds of programmed DNA double-strand
RATIONALE: Despite the central role of meiotic
DNA DSBs in generating eggs and sperm, their 
impact on de novo mutation is not well under- 
stood. Furthermore, much remains unknown 
about the range of repair pathways used, and 
in meiosis. Most meiotic DSBs occur in narrow 
regions of the genome called “hotspots.” We 
generated high-resolution maps of human 
mutation relative to hotspots by leveraging 
population-scale resources of human diversity. 
These mutations comprise hundreds of mil- 
large-scale genome changes known as struc-
compared their positions with the localization
break repair (“footprints”).


RESULTS: We found that one in four sperm 
and one in 12 eggs in humans contains a de novo 
single-base substitution attributed specifically 
to repair of meiotic breaks. Occurrence of a 
DSB increases the risk of indels and SVs in 
the vicinity even more strongly. On the auto- 
short indels and 400- to 1000-fold incre.
in SVs per break. The impact of meiotic breaks 
on the X chromosome is larger and distinct 
from that on the autosomes. Although SVs 
are biased toward insertions on the autosomes, 
deletions are particularly strongly elevated on 
the X chromosome, exhibiting a 1300-fold in- 
crease in rate per break. 

Some of these mutations have an impact on 
human health. We observed a 41% increase in 
pathogenic mutations in exonic regions over- 
lapping hotspots, with >350 genes genome 
wide affected by pathogenic or loss-of-function 
mutations attributed to meiotic break repair. 
These genes are associated with a range of X- 
linked and autosomal developmental disorders, 
neurological and autoimmune conditions, and 
cancers. 

We uncovered multiple new mutational 
footprints and signatures, which implicate un- 
expected biochemical processes in meiotic break 
repair and provide evidence that a range of 
error-prone DNA repair pathways normally 
associated with somatic and cancer cells are 
active in meiosis. For example, the nature and 
localization of many single-base substitutions 
are consistent with being generated through 
specialized low-fidelity DNA polymerases and 
flawed processing of break ends. Multiple lines 
of evidence point toward mechanisms involv- 
ing joining of DNA break ends in generating 
indels and SVs, e.g., microhomology-mediated 
end joining especially on the autosomes and 
nonhomologous end joining near telomeres 
and on the X chromosome. 


CONCLUSION: Repair of meiotic breaks is a sig- 
nificant, direct, and underappreciated source 
of human de novo mutation. We provide 
evidence that these mutations are gener- 
ated through a compendium of error-prone 
repair mechanisms that have hitherto been 
thought to be unused or suppressed in meiosis. 
Many of these mutations lead to pathogenic 
disruption of gene function. Our findings 
shed light on how the choice of mechanism 
for break repair and its consequences vary 
depending on context, with a higher muta- 
tion rate per break in males and distinct com- 
positions of mutations in different genomic 
regions. Furthermore, our results imply that the 
evolutionary benefit of increased genetic di- 
versity that is afforded by meiotic recombi- 
nation comes with a substantial mutational 
and disease burden. 
HUMAN GENETICS


Meiotic DNA breaks drive multifaceted mutagenesis in the human germ line


Robert Hinch, Peter Donnelly*, Anjali Gupta Hinch*


Meiotic recombination commences with hundreds of programmed DNA breaks; however, the degree to
which they are accurately repaired remains poorly understood. We report that meiotic break repair
is eightfold more mutagenic for single-base substitutions than was previously understood, leading to de
novo mutation in one in four sperm and one in 12 eggs. Its impact on indels and structural variants
is even higher, with 100- to 1300-fold increases in rates per break. We uncovered new mutational
signatures and footprints relative to break sites, which implicate unexpected biochemical processes and
error-prone DNA repair mechanisms, including translesion synthesis and end joining in meiotic break
repair. We provide evidence that these mechanisms drive mutagenesis in human germ lines and lead to
disruption of hundreds of genes genome wide.


Meiotic recombination is essential for
creation of gametes in most sexually 
reproducing species (1). It shuffles gen- 
etic material and, together with mu- 
tation, creates all genetic diversity. 
Recombination is initiated by the induction of 
hundreds of programmed DNA double-strand 
breaks (DSBs) (2, 3). In many vertebrates, in- 
cluding humans, these breaks cluster into narrow 
regions of the genome called “hotspots,” which 
are bound by the protein PRDM9 in a sequence- 
specific manner (4-6) (Fig. 1A). DNA is cut by 
SPO11, followed by resection, which generates
2 to 4 kb of single-stranded DNA (ssDNA) per 
break on average. In humans, a relatively small 
number of breaks (~45 to 60 in males and ~80 
to 95 in females) are repaired with a crossover 
(7, 8). Most breaks are repaired without a 
crossover through a distinct homologous re- 
combination (HR) pathway, also using the 
homologous chromosome as template (9). 
This leads to short segments of DNA being 
copied from the homologous chromosome, and 
these are known as noncrossovers. Any remain- 
ing breaks are thought to be repaired using 
HR with the sister chromatid (7). Other repair 
pathways that are used in nonmeiotic cells, 
such as end joining, are thought to be
suppressed in meiosis (10). Nevertheless, end
joining can occur during meiosis in mutant mice
that lack a critical DNA repair protein (11) or
when DSBs are induced by radiation (12).
Recombination and mutation are traditionally 
thought of as distinct mechanisms generating 
genetic variation. Nevertheless, work in yeast 
has shown that meiotic cells accumulate more 
mutations than mitotic cells (73, 74) in a SPO11- 
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dependent manner (15) [reviewed in (/6)]. 
Studies in nonmeiotic systems in bacteria and 
yeast have shown that repair of DNA breaks is 
mutagenic (17, 18) [reviewed in (19)]. Although
there are major differences between meiotic 
and nonmeiotic break repair (1), flawed repair 
is a likely cause of mutagenesis in yeast mei- 
osis (16). 

In humans, indirect population-based ap- 
proaches have shown an excess of rare C>G, 
A>G, and C>T single-nucleotide polymor- 
phisms (SNPs) in recombination hotspots (20), 
with C>G elevation in specific scenarios [e.g., 
aging oocytes (21, 22) and X chromosomes in 
males (23)]. Structural variants (SVs), defined 
as polymorphisms affecting 50+ base pairs (bp) 
of DNA, are overrepresented in hotspots (24). 
Minisatellite repeat instability (24-26) and 
genome rearrangements between regions of 
high sequence homology can occur from flawed 
recombination (27, 28). A recent study leveraged 
the extensive pedigree information in Iceland 
to establish that de novo single-base substitu- 
tions are enriched near crossovers (7), which is 
consistent with findings from sperm typing (29). 

Despite these developments, the provenance 
of mutations associated with meiotic breaks 
remains poorly understood across all size scales, 
especially in higher eukaryotes. The burden of 
single-base substitutions caused by the vast 
majority (~90%) of DSBs, i.e., those not repaired 
with a crossover, is unknown. Indels, which are 
defined as insertions or deletions <50 bp long, 
in unique DNA remain unexplored. The na- 
ture, provenance, and impact of SVs caused 
by this process also remain uncharacterized in 
general. 

To address these issues, we have harnessed 
a range of population-scale resources to con- 
struct detailed base-pair resolution maps of 
mutation relative to human recombination 
hotspots. These data include de novo and ex-
tremely rare genetic variation, comprising
341 million SNPs, 64 million short indels, and 
0.5 million SVs. These high-resolution maps 
enable us to characterize sequence properties 
of mutations and compare their footprints with 
the localization of distinct molecular processes 
taking place within hotspots (Fig. 1A). They 
reveal the scale of mutagenesis and link it with 
particular DNA repair processes, thereby pro- 
viding new insights on the nature, impacts, and 
mechanisms of these errors in the human 
germ line. 


Burden of de novo single-base substitutions 
due to meiotic break repair is eightfold higher 
than previously understood 


Unlike crossovers, other repair outcomes such 
as noncrossovers are either difficult or impossible 
to detect in gametes or pedigrees (9, 30, 31). As 
a result, their mutational impact is unknown
despite comprising ~90% of DSB outcomes 
(8, 9, 31-35). To solve this problem, we took 
an indirect approach: We measured the num- 
ber of de novo mutations (DNMs) in recombi- 
nation hotspots while controlling rigorously 
for local mutation rates. Recombination hotspots 
vary according to the DNA-binding specificities 
of their PRDM9 alleles, with the so-called 
“A-like” alleles being the most common (36, 37). 
We identified the locations of 28,286 human 
recombination hotspots at base-pair resolution 
by applying our published methodology (38) to 
published chromatin immunoprecipitation, 
followed by ssDNA-sequencing data measuring 
the occupancy of a key meiotic DSB repair 
protein (DMC1) in testes of an individual homo- 
zygous for the A allele of PRDM9 (20, 39). DMC1 
hotspots have similar localization in males and 
females (40), and our approach is robust to sex 
differences in hotspot intensities (41). These
hotspots comprise ~2% of the genome. 

For each hotspot, we counted de novo single- 
base substitutions identified in 2976 Icelandic 
trios (7) that were within 1.5 kb of the hotspot 
center, as defined by the midpoint of its PRDM9- 
binding site. The number of DNMs was cor- 
related with a measure of hotspot intensity [P = 
10°”; table S1 (41)]. However, by far most DNMs
were not associated with a crossover (Fig. 1B). 
Note that not all mutations in hotspots will be 
due to recombination (Fig. 1B); mutation counts are also
affected by local factors [e.g., GC content (42)].
To calculate the number of DNMs that are 
due specifically to meiotic break repair in each 
parent, we therefore accounted for local mu- 
tation rates and other factors [fig. S1, A and B 
(41)]. When restricted only to positions at which 
crossovers occurred in particular meioses, this 
approach gave similar mutation rate estimates 
to previously reported direct measurements 
(7), which provides validation of our indirect 
method. 
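The background correction described above can be illustrated with a minimal sketch. It assumes only the simplest form of the idea: the expected count in a 3-kb hotspot window is taken from the flanking regions 5 to 20 kb on either side of the center (30 kb in total) and rescaled to the window size. The function name and the counts are hypothetical, and the full procedure (41) accounts for further factors that are not modeled here.

```python
def excess_dnms(n_hotspot_window, n_flank, window_bp=3_000, flank_bp=30_000):
    """Excess DNMs in a hotspot window over a flank-based expectation.

    n_hotspot_window: DNMs within the window around hotspot centers.
    n_flank: DNMs in the flanking background (assumed 5 to 20 kb on both sides).
    The flank count is rescaled to the window size before subtraction.
    """
    expected = n_flank * window_bp / flank_bp
    return n_hotspot_window - expected


# Hypothetical counts, for illustration only (not data from the study).
print(excess_dnms(n_hotspot_window=130, n_flank=1000))
```

A positive value indicates more DNMs in the window than the local background rate alone would predict.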

The sex-specific DNM rates due to meiotic 
break repair (including all repair outcomes)
we inferred are 0.234 [95% confidence interval
(CI) = 0.180 to 0.288] DNMs per paternal 
and 0.080 (95% CI = 0.048 to 0.114) DNMs 
per maternal meiosis. These are eightfold 
higher than the mutation burden per meio- 
sis due to crossovers alone in both males and 
females. The contribution of crossovers is 0.028 
(95% CI = 0.022 to 0.034) for males and 0.011 
(95% CI = 0.005 to 0.013) for females in these 
hotspots. They comprise ~0.5% of the total 
burden of DNMs genome wide and imply that, 
on average, about one in four sperm and about 
one in 12 eggs has a DNM specifically due to 
repair of meiotic breaks. 
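The fold difference and the "one in N gametes" figures follow directly from the rates quoted above. The sketch below reproduces that arithmetic; it treats the mean number of DNMs per meiosis as the probability that a gamete carries such a DNM, which is an approximation that holds only because the rates are small. The function name is ours.

```python
# Rates quoted in the text: DNMs per meiosis attributable to meiotic break repair.
rates = {
    "paternal": {"all_outcomes": 0.234, "crossovers": 0.028},
    "maternal": {"all_outcomes": 0.080, "crossovers": 0.011},
}


def summarize(rate_all, rate_crossover):
    """Return (fold excess over crossovers alone, N in 'one gamete in N')."""
    fold = rate_all / rate_crossover
    one_in_n = 1.0 / rate_all
    return fold, one_in_n


for parent, r in rates.items():
    fold, one_in_n = summarize(r["all_outcomes"], r["crossovers"])
    print(f"{parent}: {fold:.1f}-fold over crossovers, about one gamete in {one_in_n:.1f}")
```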

The proportion of DNMs resulting from cross- 
over (~12% for males and ~13% for females) 
is comparable to the proportion of DSBs that 
is resolved as crossover. These data support 
the parsimonious view that single-base sub- 
stitutions in hotspots are driven by DSBs 
and are not strongly influenced by the partic- 
ular repair outcome. Because the number of 
crossovers per meiosis is well understood (7), 
we can estimate the average number of DSBs 
per meiosis under this model (41). These are 
441 (95% CI = 327 to 590) for males and 620 
(95% CI = 349 to 1045) for females. The in- 
ferred sex-averaged DNM rate per break is 
shown in Fig. 1C. 
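Under the model described above, the number of DSBs per meiosis can be approximated by dividing the number of crossovers per meiosis by the fraction of break-associated DNMs that arise at crossovers. The sketch below is a simplified version of that reasoning, not the estimation procedure of (41). The crossover counts are assumed midpoints of the ranges quoted in the introduction (~45 to 60 in males and ~80 to 95 in females).

```python
def estimate_dsbs(crossovers_per_meiosis, dnm_rate_crossover, dnm_rate_all):
    """DSBs per meiosis, assuming DNMs per break do not depend on repair outcome."""
    crossover_fraction = dnm_rate_crossover / dnm_rate_all
    return crossovers_per_meiosis / crossover_fraction


# 52.5 and 87.5 are assumed midpoints of the crossover ranges given in the text.
male_dsbs = estimate_dsbs(52.5, dnm_rate_crossover=0.028, dnm_rate_all=0.234)
female_dsbs = estimate_dsbs(87.5, dnm_rate_crossover=0.011, dnm_rate_all=0.080)
print(round(male_dsbs), round(female_dsbs))
```

The resulting values are close to, but not identical with, the reported estimates of 441 and 620, and both fall within the reported confidence intervals.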

We estimated that the single-base substitu-
tion rate per programmed DSB is 6.6 × 10⁻⁴ in
males and 2 × 10⁻⁴ in females on average (41).
This implies that break repair is more error- 
prone in males, with mutation rate per break 
in male meiosis being about threefold higher 
than that in female meiosis. It is noteworthy 
that DNMs in the rest of the genome also occur
about three to four times as often on the pa-
ternal relative to the maternal genome (7, 43).

Fig. 1 (caption fragment). [...] place (not drawn to scale). PRDM9 binding is followed by induction of programmed
DNA DSBs by SPO11. SPO11 is released from break ends, and resection degrades [...]
side of break, which is to the left of the DSB on the forward strand and to its
right on the reverse strand. These are bound by the key repair proteins DMC1 and
RAD51, with DMC1 binding close to the DSB site and RAD51 away from it. DNA [...]


Footprints and mechanisms of single-base 
substitutions in hotspots 


DNA breaks and a succession of repair pro- 
cesses occur within distinct segments inside 
recombination hotspots (Fig. 1A). PRDM9 binds 
a sequence motif at the center of hotspots, and 
SPO11 induces breaks within ~100 bp from it. 
DNA on either side of the break undergoes 1 to 
2 kb of resection to produce ssDNA, which is 
bound by the repair proteins DMC1 and RAD51 
(Fig. 1A). Resynthesis of DNA closest to the 
break site occurs within the context of a criti- 
cal recombination intermediate known as the 
D-loop (Fig. 1A). It is not known how the re- 
maining DNA, which comprises most of the 
resected segment, is repaired for most breaks. 
A natural question, therefore, is whether 
mutation rates and outcomes are different in 
these distinct segments of hotspots, and the 
answer to this question may further our under- 
standing of the underlying mechanisms and 
repair processes. However, the resolution even 
with thousands of trios is insufficient to distin- 
guish mutations in these segments (Fig. 1C). To 
overcome this problem, we leveraged extremely 
rare polymorphisms as a proxy for DNMs. This 
approach effectively includes DNMs that have 
arisen relatively recently in humans, and the im- 
pact of selection and drive on them is expected 
to be small (23, 44). We used whole-genome poly- 
morphism data provided by gnomAD, com- 
prising >70,000 genomes from several global 
populations (45). We filtered for high-quality 
variants with an allele frequency <10~°, which 
leaves 341 million single-base substitutions. This
represents an ~1800-fold increase in the num-
ber of mutations relative to the trios above.

Fig. 1 (caption fragment, continued). [Axis label: Distance from hotspot centre (bp).] [...]
hotspots (n = 25,440) were sorted by their DMC1 intensity and divided into five
equal bins. The average number of paternal and maternal DNMs in hotspots per
proband per base in each bin is shown: all DNMs within 1.5 kb of hotspot centers
(red), DNMs within 1.5 kb of hotspot centers but not associated with a crossover
(orange), and DNMs between 5 and 20 kb from hotspot centers (blue; rescaled
for a 3-kb window to facilitate comparison). (C) Only a small proportion of
hotspots (<1%) experience a break in any given meiosis. Here, we show the
DNM rate per break inferred from the enrichment of DNMs in hotspots (3-kb
moving window). The center of a hotspot is defined as the midpoint of its [...]

Mapping these mutations to hotspots reveals 
unexpected structure in the mutagenesis profile: 
a wide base of mutations ±2 kb from the hotspot
center (Fig. 2A), a peak of mutagenesis within 
100 bp of the hotspot center (Fig. 2, A and B), 
and an additional intense peak within the 
PRDM9-binding site itself (Fig. 2B). Directly 
measured DSBs in mouse are also concen- 
trated within ~100 bp from hotspot centers 
and have a further peak within the PRDM9- 
binding site (46). 

Analysis of the mutation spectrum shows 
that all six single-base substitutions (C>T, C>G, 
C>A, A>G, A>C, and A>T) and their reverse 
complements are enriched with distinctive 
footprints (Fig. 2, C to E, and fig. S2, A to E). 
Whereas DSBs affect both DNA strands, sev- 
eral downstream repair processes are strand 
specific (Fig. 1A). As a result, mutations that 
arise due to strand-specific processes may ex- 
hibit strand “asymmetry.” For example, a muta- 
tion type may be enriched on one side of a 
break and its reverse complement on the other 
side. The presence or absence of strand asym- 
metry in mutations is thus informative about 
the nature of processes giving rise to them. 
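One simple way to quantify the strand asymmetry described above is to compare how often a mutation type falls on one side of the hotspot center while its reverse complement falls on the other side. The sketch below is an illustrative score of our own construction, not the statistic used in the study: it ranges from -1 to 1, with values near 0 indicating no asymmetry. Mutations are assumed to be given on the forward strand as (offset from hotspot center in bp, reference base, alternate base).

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def strand_asymmetry(mutations, ref, alt):
    """Asymmetry score for ref>alt versus its reverse complement around a center.

    'same' counts ref>alt left of center plus the reverse complement right of it;
    'opposite' counts the mirrored arrangement. Returns (same - opposite) / total.
    """
    rc_type = (COMPLEMENT[ref], COMPLEMENT[alt])
    same = 0
    opposite = 0
    for offset, m_ref, m_alt in mutations:
        if offset == 0:
            continue  # a mutation exactly at the center has no side
        left = offset < 0
        if (m_ref, m_alt) == (ref, alt):
            if left:
                same += 1
            else:
                opposite += 1
        elif (m_ref, m_alt) == rc_type:
            if left:
                opposite += 1
            else:
                same += 1
    total = same + opposite
    return 0.0 if total == 0 else (same - opposite) / total


# Hypothetical mutations, for illustration only.
example = [(-500, "C", "G"), (-300, "C", "G"), (400, "G", "C"),
           (600, "G", "C"), (200, "C", "G")]
print(strand_asymmetry(example, "C", "G"))
```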

We observed that mutations in the central 
±100 bp do not exhibit significant strand asym-
metry, which is consistent with them arising as
a result of flawed processing of DNA break ends 
(Fig. 1A). By contrast, mutations in the flank- 
ing 2 kb exhibit strong strand asymmetry and 
are composed of C>N and A>N mutations on 
the forward strand and G>N and T>N mutations
on the reverse strand (Fig. 2, C to E, and fig. S2,
A to E).

Further analysis revealed distinct footprints 
within these flanks. Several mutation types 
(C>G, A>G, and C>A) have “off-center” peaks 
700 to 800 bp from the center (Fig. 2, D and 
E, and fig. S2, A, B, and E). For C>G muta-
tions on the X chromosome, for example, the
off-center peaks are particularly intense, sub- 
suming any central signal (Fig. 2E). These 
peaks match the binding profile of the RAD51 
recombinase in hotspots (Fig. 2F). Recent 
work in mouse (47) has shown that this is also 
the stretch of ssDNA that lies outside the
“D-loop” (Fig. 1A), suggesting mechanism(s)
involving DNA resynthesis distal to the break
site. The footprint of the remaining mutations
resembles localization of ssDNA in hotspots,
including ssDNA in the D-loop (Figs. 1A and
2H), and is consistent with hypermutation
within it (48, 49).

Fig. 2 (caption fragment). [...] window). The figure corrects for differences in sequence composition in and
around hotspots. (D) Same as (C), but showing C>G and G>C mutations per
C and G base, respectively (200-bp moving window). (E) Same as (D), but for the
X chromosome. (F) DNA-binding footprint of RAD51 in mouse hotspots measured
with ChIP-seq for RAD51 in mouse testes (47). (G) Same as (E), but combining [...]
that exhibit “off-center” peaks of mutations, namely [...]>TpG and their reverse
complements on the X chromosome. (H) Same as (C), but for C>T (excluding
CpG>TpG), A>T, and A>C [...] complements on the autosomes.

In summary, our analyses have revealed un-
expected complexity in mutagenesis, implicating
three distinct factors, namely SPO11-mediated 
break machinery, ssDNA hypermutation, and 
DNA resynthesis outside the D-loop, that underlie 
the overall increase in single-base substitu- 
tions in hotspots. We investigate the mecha- 
nisms underlying these factors below. 


Hotspots are a source of indels and SVs 


To understand the role of meiotic breaks in 
generating length polymorphisms, we extended 
the approach above to indels and SVs. For indels, 
we used whole-genome data from gnomAD, 
comprising 64 million indels after filtering for 
variant quality. For SVs, we included the fol- 
lowing datasets: (i) gnomAD-SV (387,780 SVs 
from 10,847 individuals from four global popu- 
lations based on short-read data) (50) and (ii) 
deCODE (133,886 SVs from 3622 Icelanders 
based on long-read data) (24). The two SV 
datasets have contrasting strengths (size versus 
read length) and we used them as replication 
datasets. 

First, we consider indels. Insertions are ~399- 
fold higher (95% CI = 395 to 403) and deletions 
are 115-fold higher (95% CI = 113 to 117) per 
break than would be expected in these regions 
in the absence of the break, i.e., from genomic
insults collectively in the germ line and the 
zygote (Fig. 3A). The footprint of 1-bp indels 
(Fig. 3B) shows strong concentration in the 
PRDM9-binding motif, similar to the footprint 
of single-base substitutions (Fig. 2B). 

SVs comprise mainly deletions and insertions 
(50). Hotspots harbor one or both breakpoints
of 7% of autosomal SVs. SV insertion break- 
points are 930-fold elevated (95% CI = 848 to 
1008), and SV deletion breakpoints are 431-fold 
elevated (95% CI = 376 to 488) per break (Fig. 
3C and fig. S3, A and B). As is the case with
(short) indels, SVs in autosomal hotspots are 
biased toward insertions. 

The bias toward insertions is reversed on 
the X chromosome (Fig. 3D and fig. S3C). Al- 
though there is an increase in SV insertions 
per break (243-fold, 95% CI = 43 to 456), hot- 
spots on the X chromosome are a particularly 
significant source of SV deletions. They exhib- 
it a 1343-fold increase (95% CI = 885 to 1835) 
in frequency per DSB (Fig. 3D) and harbor 
breakpoints of 10% of SV deletions on the X 
chromosome. 

We characterize indel polymorphisms fur- 
ther below, stratifying them by the DNA con- 
text of their breakpoints (unique DNA or any 
of the known repetitive DNA families) to ac- 
count for underlying differences in mutation 
propensity. 


Indels in hotspots are larger and biased 
toward insertions in the autosomes 


First, we consider indels in unique DNA. Here, 
deletions outnumber insertions 2:1 outside
autosomal hotspots (Fig. 4A). We observed the
opposite inside hotspots, with a strong excess of 
insertions (Fig. 4A). Indels in hotspots are larger 
and more numerous than those outside (Fig. 4, 
B and C, and fig. S4, A to D). The number of
insertions and deletions is correlated with 
hotspot intensity (P = 2 x 10°”° for insertions 
and P = 9 x 10 ™ for deletions; fig. S4, E and F).
They are particularly elevated in hotspots close 
to telomeres (P = 7 x 10° and P = 2 x 104, 
respectively) over and above the expectation 
from local mutation rates (as are single-base 
substitutions; table S1). It is known that 
meiotic breaks lead to instability of mini-
satellites near telomeres (51). However, to the
best of our knowledge, a higher mutation rate 
in repair of telomere-proximal meiotic breaks 
in general, including those in unique DNA, has 
not previously been reported. 

Next we assess indels in tandem repeats 
(TRs). These have a different composition rel- 
ative to those in unique DNA, both inside and 
outside hotspots. Insertion and deletion rates 
are similar outside hotspots, consistent with 
evolutionary stability of tandem repeat (TR) 
sequences on average (fig. S5A). However, in- 
side hotspots, there is an approximately twofold 
bias toward insertions relative to deletions (fig. 
S5A) (25). Indels in hotspots are larger than 
those outside (fig. S5, B to E) and are usually 
multiples of the TR unit, particularly for in- 
sertions (fig. S5, F and G). Their rates are cor- 
related with hotspot intensities and modulated 
by context (table S1), including proximity to telo-
meres (table S1) (51). Analysis of mutations in
transposable elements provides further con- 
firmation that elevated mutagenesis is due to 
the recombination machinery per se (as opposed 
to, e.g., sequence composition) (fig. S6) (47). 

We conclude that bias toward insertions 
and higher mutation rates near telomeres are 
a consistent feature of mutations arising from 
meiotic breaks in the autosomes. 


Disease impacts of mutations resulting from 
repair of meiotic breaks 


Exons are overrepresented near hotspots for 
the PRDM9 alleles mapped in humans (Fig. 5A
and fig. S7, A and B), a phenomenon also seen 
in mouse hotspots (38). Hotspots of the human 
A allele analyzed here overlap exons in 3486 
genes. Therefore, we sought to assess the health 
impacts of mutations arising from meiotic 
breaks through gnomAD and also ClinVar, a 
large-scale database of polymorphisms with 
evidence regarding their clinical significance 
(52). First, we examined how often these muta- 
tions lead to disruption of genes in the pop- 
ulation through gnomAD. These data exclude 
individuals with severe pediatric disease and 
their first-degree relatives and may represent 
an underestimate of the overall impact of muta- 
tions. To provide a complementary viewpoint, 
we assessed pathogenic variation in ClinVar.

The number of gnomAD predicted loss of
function (pLOF) mutations in hotspots is 75% 
(95% CI = 71 to 80%) higher for indels and 67% 
(95% CI = 63 to 71%) higher for single-base 
substitutions than expected from the propor- 
tion of DNA sequence in hotspots (41). To assess 
whether this is driven solely by the increased 
mutation rates established above, we compared 
the fraction of mutations that lead to pLOF 
in hotspots with that in the genome. We found 
that a single-base substitution in a hotspot is 
38% (95% CI = 35 to 42%) more likely to be 
pLOF than one elsewhere. For indels, the cor- 
responding value is 28% (95% CI = 25 to 31%). 
These impacts are consistent with the over- 
representation of hotspots near exons (41). 
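The two sets of percentages above are related by a simple identity: the excess in the number of pLOF mutations in hotspots is the product of the excess in the number of mutations and the excess in the per-mutation probability of being pLOF. The sketch below uses this relation to back out the implied excess in mutation count. It is our own arithmetic on the quoted point estimates, ignores the confidence intervals, and is not a figure reported in the study.

```python
def implied_mutation_excess(plof_count_excess, per_mutation_excess):
    """Excess in mutation count implied by the two quoted excesses.

    Both arguments are fractional excesses (0.75 means 75% higher).
    Uses (1 + count excess) = (1 + mutation excess) * (1 + per-mutation excess).
    """
    return (1.0 + plof_count_excess) / (1.0 + per_mutation_excess) - 1.0


indel_excess = implied_mutation_excess(0.75, 0.28)
substitution_excess = implied_mutation_excess(0.67, 0.38)
print(f"indels: {indel_excess:.0%}, single-base substitutions: {substitution_excess:.0%}")
```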

For a mutation, we can estimate the proba- 
bility that it arose from meiotic break repair 
based on its distance from a hotspot center 
and the overrepresentation of that variant type
in hotspots (fig. S7, C and D). Here, we report
pLOF mutations that are most likely to have
arisen as a result of programmed breaks (average 
P = 0.80; see table S2 for additional variants). 
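One common way to turn an overrepresentation into a per-variant probability is the attributable fraction: if a variant type is F-fold enriched at a given distance from a hotspot center, a fraction 1 - 1/F of the variants observed there is in excess of background. The sketch below applies this formula as an assumption for illustration. The study's own calculation is described in fig. S7 and (41) and may differ.

```python
def prob_from_break(fold_enrichment):
    """Attributable fraction for a variant at a site with the given fold enrichment.

    Returns 0.0 when there is no enrichment (fold <= 1).
    """
    if fold_enrichment <= 1.0:
        return 0.0
    return 1.0 - 1.0 / fold_enrichment


# Under this assumption, a fivefold enrichment corresponds to P = 0.80.
print(prob_from_break(5.0))
```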

We observed 206 SVs and 77 indels disrupt- 
ing 278 genes that meet these criteria. Among 
them, 40 genes have multiple pLOF variants 
attributable to meiotic DSBs (31 SVs affect 
more than one gene) (table S2). These genes 
are linked with a range of X-linked and auto-
somal disorders, e.g., TMLHE (X-linked autism),
CDKL5 (developmental encephalopathy), FANCD2
(Fanconi anemia), FLT4 (congenital heart de- 
fects), FTCD (glutamate formiminotransferase 
deficiency), and DOCK8 (DOCK8 immuno- 
deficiency syndrome) (Fig. 5B, fig. S7E, and 
table S2). To the best of our knowledge, of these 
278 genes, mutations in only SHOX, the VCX 
gene family, and PRDM9 itself have previ- 
ously been attributed to meiotic break repair 
(24, 36, 53, 54). 

Genes in ClinVar have received variable de- 
grees of investigation. To prevent confounding 
for this reason or biological factors such as a 
difference in tolerance for mutations, we took 
a stringent approach by comparing patho- 
genic polymorphisms in exons that overlap 
hotspots with other exons of the same gene 
(41). Specifically, we investigated multiexonic 
genes with at least one pathogenic exonic poly-
morphism (n = 1298). We found that hotspot-
overlapping exonic regions contain 41% more
pathogenic mutations per base than nonover- 
lapping ones on average [95% CI = 11 to 80%, 
P = 5 x 10“ (41)]. The impact on exonic re-
gions closer to hotspot centers is even higher
[with nearly double the rate of pathogenic mu-
tations ±100 bp from hotspot centers (41)].

We identified 81 genes that have hotspot- 
overlapping exons with statistically signifi- 
cant increases in pathogenic mutations after 
Bonferroni correction for multiple testing (Fig. 
5C and table S3). These genes include HEXA 
(Tay-Sachs disease), CDKN1B (neoplasia), GATA1
(thrombocytopenia and thalassemia), and
SH2D1A (lymphoproliferative syndrome).

Fig. 3. Footprints of extremely rare indel and SV breakpoints relative to
human hotspots. (A) Fold excess in the number of indel breakpoints per DSB
(allele frequency <10~°) in and around autosomal hotspots (100-bp moving
window). Indels overlapping Alu elements are not included and are shown separately
in fig. S6. Both breakpoints are included. (B) Same as (A), but with a magnified view
of 1-bp insertions and deletions (20-bp moving window). The PRDM9-binding site [...]

Collectively, these data establish that mei- 
otic breaks are a previously underrecognized 
cause of human disease. 


Footprints of single-base substitutions 
implicate translesion DNA polymerases in 
meiotic break repair 


Many exogenous and endogenous factors that 
affect DNA have characteristic “mutational 
signatures.” These signatures have proved power- 
ful in understanding the molecular processes 
driving many cancers (55). For example, the
trinucleotide context of mutated bases, i.e., the
bases immediately upstream and downstream 
of the mutated base, can help to distinguish their 
underlying causes. 
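The trinucleotide context of a substitution can be extracted as in the following sketch. It folds each mutation and its reverse complement onto a C or A reference base, following the six classes listed earlier in the text (C>T, C>G, C>A, A>G, A>C, and A>T). The function names and the `T[C>G]G` label format are our own illustrative choices.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def trinucleotide_class(reference, pos, alt):
    """Label such as 'T[C>G]G' for a substitution at 0-based index pos.

    The mutated base is reported as C or A; mutations at G or T are
    converted to their reverse complement, flanking bases included.
    """
    if pos < 1 or pos > len(reference) - 2:
        raise ValueError("position needs one flanking base on each side")
    context = reference[pos - 1:pos + 2].upper()
    alt = alt.upper()
    if context[1] in "GT":
        context = revcomp(context)
        alt = revcomp(alt)
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


seq = "ATCGA"
print(trinucleotide_class(seq, 2, "G"))  # C>G on the forward strand
print(trinucleotide_class(seq, 3, "C"))  # G>C, reported as its reverse complement
```

Counting these labels inside hotspots and in the local background, per available trinucleotide, gives a signature of the kind discussed in this section.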

We therefore assessed the trinucleotide muta- 
tional signature of the central and off-center 
peaks of single-base substitutions identified 
above (footprints 1 and 2; Fig. 2, B and G, 
respectively). We observed significant varia- 
bility in mutation rates depending on the 
trinucleotide context in both footprints (Fig. 6, 
A and B, and fig. S8, A and B). In addition to 
single-base substitutions, the central peak muta-
tional signature includes 1-bp indels (Fig. 3B).
None of the known mutational signatures in-
ferred from cancer genomes (55) is a good fit
for this signature. One possible mechanism is a
noncanonical pathway for processing meiotic
breaks that could enable repair through end
joining (fig. S8C) (56, 57).

Fig. 3 (caption, continued). [...] is highlighted. The motif may be present in the orientation shown or its reverse
complement. (C) Fold excess in the number of SV breakpoints (allele frequency <10™)
per DSB detected with long-read sequencing in an Icelandic population (deCODE-SV)
relative to autosomal hotspots (100-bp moving window). The hotspot proximal breakpoint is
shown. See fig. S3 for the hotspot-distal breakpoint and data from multiple populations
(gnomAD-SV). (D) Same as (C), but for the X chromosome (100-bp moving window).

In the off-center peaks, which reflect the 
regions typically outside the D-loop (Fig. 1A), 
the trinucleotide context of C>G mutations 
(Fig. 6B) is consistent with preferences of AID/ 
APOBEC cytosine deaminases, which are known 
DNA mutators (58, 59). C>T mutations in these 
peaks are strongly elevated in a CpG context 
(Fig. 6B), consistent with both spontaneous and
enzymatic deamination in ssDNA (49). These
data indicate that DNA outside the D-loop
accrues more cytosine deamination than DNA
within it, which is subsequently repaired
incorrectly.

Fig. 4 (caption fragment). [...] of hotspot centers (blue) or 8 to 10 kb from it (gray, rescaled to the
average number per 200 bp to facilitate comparison). See fig. S4A for the
full size scale. (C) Same as (B), but for deletions. See fig. S4C for the full
size scale.

Translesion synthesis (TLS) DNA polymerases 
such as Rev1, Polη, and Polζ are strong can-
didates for effecting this repair (fig. S9). TLS
can lead to C>G, C>T, and A>G mutations 
(19, 49, 60-62), whereas a nonexclusive pos- 
sibility for C>T mutations is replication after 
cytosine deamination (49). C>G mutations are 
a telltale signature of REV1 (49), and repair by 
mismatch-repair machinery leveraging Polη can
give rise to A>G mutations (63). Our analysis of 
published single-cell RNA-sequencing data from 
mouse testes (64) shows that several TLS poly- 
merases are highly expressed at the relevant 
time frame in meiosis (fig. S9, A to F). TLS 
involvement in yeast meiosis is indicated by 
two-hybrid associations (65) and they mediate 
cell cycle-dependent repair stimulated by RAD51 
in somatic cells (66-68). Our findings thus sug- 
gest that TLS polymerases, potentially mediated 
by RAD51 (Fig. 2F), are involved in filling the 
gap that remains in resected DNA distal to the
break site after HR. A nonexclusive possibility
is reduced efficiency of mismatch repair in the 
region outside the D-loop (49). 

In addition to the off-center peaks, C>G and 
A>G mutations exhibit long-range (>10 kb) 
strand-asymmetric mutations (Fig. 6, C and D). 
This signature is consistent with TLS in the 
context of break-induced replication, a distinct 
and highly error-prone repair pathway that gen- 
erates long tracts of ssDNA through a migrating 
D-loop (Fig. 1A) (19, 69). 


Sequence features of indels implicate 
template switching and microhomology-mediated 
end joining in meiotic break repair 

To understand the mechanisms generating short 
insertions, we compared each inserted sequence 
in unique DNA with its flanking sequences. We 
observed that 82% of bases matched, on average, 
between the inserted sequence and the more 
similar of its right- and left-flanking sequences, 
which would not be expected by chance (P < 
2 × 10⁻¹⁶; Fig. 7A). The canonical model for
generating insertions is “polymerase slippage” 
(27, 70). Under this model, the polymerase 
performing DNA synthesis for break repair
dissociates from its template and subsequently
reattaches to a segment already synthesized, 
which leads to a duplication. Mismatches from 
the template (i.e., an insertion that is non- 
identical) could be due to random polymerase 
errors or to copying from an incorrect tem- 
plate (19). However, neither random poly- 
merase errors nor copying from a random 
template are able to explain the observed data, 
alone or in combination, especially for inser- 
tions longer than ~10 bp (Fig. 7B and fig. S10, 
A to C) (41).
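The comparison between an inserted sequence and its flanks can be sketched as follows. For each insertion, the inserted bases are compared position by position with the equally long stretch of reference sequence immediately to the left and to the right of the insertion point, and the higher of the two match fractions is kept. This is a minimal illustration of the idea under our own assumptions (ungapped comparison, flanks at least as long as the insertion), not the analysis code of the study.

```python
def match_fraction(a, b):
    """Fraction of positions at which two equal-length sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def best_flank_match(left_flank, inserted, right_flank):
    """Match fraction between an insertion and the more similar of its flanks.

    left_flank ends at the insertion point; right_flank starts at it.
    """
    k = len(inserted)
    if k == 0 or len(left_flank) < k or len(right_flank) < k:
        raise ValueError("flanks must be at least as long as the insertion")
    left = left_flank[-k:]
    right = right_flank[:k]
    return max(match_fraction(inserted, left), match_fraction(inserted, right))


# Hypothetical sequences: the insertion differs from the left flank at one base.
print(best_flank_match("TTACGTAC", "CGTTC", "GGATC"))
```

A perfect side-by-side duplication gives a value of 1, whereas polymerase errors or copying from a different template lower it.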

We reasoned that the pattern of mismatches 
could be explained if some insertions are gen- 
erated by successively copying from multiple 
templates such as the correct and one or more 
incorrect templates (42), a phenomenon known 
as “template switching” (19). To capture this,
we modeled insertions as arising from tem- 
plates with varying degrees of similarity with 
the correct template while also allowing for 
random polymerase errors. We used Markov 
chain Monte Carlo (MCMC) to sample template 
properties and polymerase error rates consist- 
ent with the data and found that this model 
captures the observed distribution of mismatches
(Fig. 7B and fig. S10, A to D). Under this model,
we inferred that 63% (95% CI = 60 to 66%) of
insertions are generated by copying the same
template more than once (i.e., side-by-side dup-
lications) (fig. S10E) with a polymerase error
rate of 1.2% (95% CI = 1.0 to 1.5%) (fig. S10F),
which is consistent with properties of TLS
polymerases (68, 71). The remaining 37% of
insertions are combinations of homologous
and nonhomologous sequence, consistent with
template switching (fig. S10E). A tendency to
fall off their template sequence after incor-
porating only a small number of nucleotides,
i.e., low processivity, is another hallmark of
TLS polymerases (68).

Fig. 5. Disease impacts of mutations resulting from meiotic breaks.
(A) Enrichment of exons near hotspots. For each hotspot, exons within 100 kb
were included. Each base pair in an exon counts toward the total at the
corresponding distance from each hotspot center (3-kb moving window). The
95% confidence intervals are shown in gray. (B) Examples of predicted loss-
of-function SVs affecting genes associated with disease. Gene bodies (black
lines), hotspots (horizontal green lines), insertions (blue arcs joining start and
end points), and deletions (red arcs joining start and end points). SVs with
breakpoints in hotspots are shown with thicker arcs. Genes (and associated
diseases) shown are: TMLHE (autism X-linked) and CDKL5 (developmental
and epileptic encephalopathy and atypical Rett syndrome), FTCD (glutamate
formiminotransferase deficiency), FANCD2 (Fanconi anemia), and FLT4
(lymphatic malformation and congenital heart defects). (C) Same as (B),
but for ClinVar data. Exons (thick black lines), hotspots (horizontal green lines),
and reported pathogenic mutations are shown, which are single-base substitutions
(black), deletions (red), insertions (blue), and other or unspecified (gray).
Genes shown are GATA1 (thrombocytopenia and thalassemia) and CDKN1B
(neoplasia).

Fig. 6. Mutational signatures of single-base substitutions in hotspots.
(A) Trinucleotide mutational signature showing the number of mutations observed
per base in the central region of autosomal hotspots (±50 bp from the center
of the PRDM9-binding motif) after correcting for the local background rate. The
rates shown include the reverse complement. A positive value implies a higher
rate inside hotspots than background and vice versa. The figure corrects for
differences in sequence composition in and around hotspots. (B) Same as (A), but
for the off-center peaks in X chromosome hotspots. [...]

End-joining repair pathways that ligate DNA 
on either side of the break can lead to deletions 
(72). Although these pathways are thought to be 
suppressed in meiosis, we hypothesized that end 
joining is used as a backup mechanism for sites 
that remain partially or entirely unrepaired at a 
critical stage (late pachytene). Work in mitotic 
cells, HR-deficient cancers, and in vivo in worm 
provides evidence that the microhomology- 


mediated end-joining (MMEJ) pathway medi- 
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é trinucleotide context of 


ated by DNA polymerase 0 (also known as 
theta-mediated end joining or TMEJ for short) 
can repair resected break sites previously 
occupied by HR proteins (72-74). In meiosis, 
a switch from HR to MMEJ at unrepaired pro- 
grammed DSB sites would lead to germline 
mutations. In addition to deletions, this pathway 
can generate insertions, e.g., when DNA next to 
the break site has already been extended, as 
discussed above (Fig. 7, A and B). Accordingly, 
we investigated whether sequences at deletion 
and insertion breakpoints show evidence for 
microhomology (we restricted the analysis of 
insertions to side-by-side duplications because 
the template can be identified in those cases). 

For autosomal indels that have both break- 
points in unique DNA, we examined the inserted 
and deleted sequences for microhomology. The 
vast majority of indels in hotspots showed sig- 
nificant microhomology at the breakpoint, 
which would not be expected by chance (fig. 
S11A; P< 2 x 107"). Because other end-joining 
pathways can also exhibit some microhomology 
(usually between 0 and 3 bp for nonhomologous 
end joining or NHEJ for short) [reviewed in 

The trinucleotide context of C>G changes resembles the preferences of the AID/APOBEC family, particularly
AID, APOBEC3F, and APOBEC3G (58, 59). APOBEC3F is strongly expressed in 
human testes (59). (C) Fold excess in the number of extremely rare C>G and 
G>C mutations per C and G base per DSB, respectively, in and around autosomal 
hotspots (n = 25,440; 1-kb moving window). The figure corrects for variation in 
sequence composition. Arrows highlight the long-range excess of strand- 
asymmetric mutations away from hotspot centers. (D) Same as (C), but for 
A>G and T>C mutations. 


(72, 75)], we compared the properties of hotspot 
indels with those in the local background. Indels 
in hotspots exhibit significantly greater micro- 
homology than those outside (Fig. 7, C and D). 
Microhomologies in insertions and deletions in 
hotspots are similar to each other (fig. S11, B and 
C), which is consistent with them arising from a 
shared etiology. Microhomologies range mainly 
between 1 and 10 bp, but are sometimes higher, 
consistent with known properties of MMEJ 
(72, 76). 
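
As an illustration of how breakpoint microhomology can be measured, the following minimal Python sketch counts the bases shared between a deleted sequence and its two flanks. The function names, the choice to sum the matches on both sides (capped at the deletion length), and the toy sequences are assumptions made for illustration; they are not taken from the analysis pipeline used in this study.

```python
def shared_prefix(a: str, b: str) -> int:
    """Length of the common prefix of two sequences (case-insensitive)."""
    n = 0
    for x, y in zip(a.upper(), b.upper()):
        if x != y:
            break
        n += 1
    return n


def deletion_microhomology(left_flank: str, deleted: str, right_flank: str) -> int:
    """Microhomology at a deletion breakpoint (illustrative definition).

    A deletion can be shifted right while the start of the deleted sequence
    matches the start of the right flank, and shifted left while the end of
    the deleted sequence matches the end of the left flank. The sum of both
    shifts is the number of bases shared at the junction.
    """
    right = shared_prefix(deleted, right_flank)
    left = shared_prefix(deleted[::-1], left_flank[::-1])
    return min(left + right, len(deleted))


# Toy example: reference TTTAC|GATTC|GATAA with GATTC deleted.
mh = deletion_microhomology("TTTAC", "GATTC", "GATAA")
print(mh)
```

Under this definition, a value of 0 corresponds to a blunt junction, whereas values of several base pairs are more typical of MMEJ than of NHEJ.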

A model that is consistent with these data 
allows for deletions to arise through MMEJ be- 
tween resected sites flanking DSBs and insertions 
through MMEJ between sites where some DNA 
resynthesis has already taken place (fig. S11D). 
Polθ, which is absent in yeast, is expressed in
only a few tissues in human and mouse, with 
highest expression in testis (77). Furthermore, 
our analysis of single-cell RNA-sequencing data 
from mouse testes (64) shows that Polθ and
Lig3, the major ligase mediating MMEJ (72), 
are highly expressed during the relevant time 
frame in meiosis, i.e., pachytene (fig. S11, E and F). 
Collectively, these data strongly suggest that
MMEJ is a major mutagenic force in the human
germ line.

Fig. 7. Mechanisms underlying insertions and deletions in hotspots.
(A) Sequence homology between the inserted sequence and the more similar
of its two flanking sequences (blue) for insertions with both breakpoints in
unique DNA (n = 12,029; mean = 0.82). The homology between the flanking
sequences themselves (green) is shown as control. Error bars indicate 2 SEs for
the estimate of the mean. (B) The proportion of inserted sequences that are a
perfect match to their best-match flanking sequence (y axis) is shown relative
to the insertion length (x axis). We fitted and tested three models for generating
mismatches: random polymerase errors (purple, “Polymerase error only”),
copying the adjacent sequence or a random sequence from the genome, with
polymerase errors (green, “Duplication or random”), copying the adjacent
sequence or a sequence with varying degrees of homology with it, with
polymerase errors (blue, “Model with template switching”). The distribution of
template homology and probability of polymerase errors were inferred from
the data under this model (see also fig. S10). (C) The proportion of deleted
sequences wherein the base at a given position matches the base at the
corresponding position in the flanking DNA sequence for deletions arising from
meiotic breaks (red) relative to those in local background (gray). The average
microhomology across sequences inside a hotspot is the weighted mean of
microhomology due to background processes and that due to meiotic break
repair. We infer the meiosis-specific signal by subtracting out the background
signal. Error bars indicate 2 SEMs. Sequences ≥5 bp in length with a breakpoint
within the PRDM9-binding motif were included. (D) Same as (C), but for
duplications, which are compared with the DNA sequence flanking the inferred
template. Insertions that were not perfect duplications were excluded due to
uncertainty in identifying the correct position for comparison.


Provenance of SVs in hotspots


The canonical model for SVs associated with
the recombination machinery is nonallelic
homologous recombination (NAHR), which
posits that SVs are generated through ectopic
pairing between large DNA segments with
high sequence similarity (28). Although it is
a good fit for several mutations that underlie
specific genomic disorders, NAHR cannot explain
the overall pattern of SVs that we have observed
(Fig. 3C and fig. S12A). The NAHR model predicts
an excess of deletions over insertions (specifically
duplications) (28). By contrast, we have shown
a strong excess of insertions over deletions in
autosomal hotspots (Fig. 3C and fig. S12A).
Detailed sequence and breakpoint analysis
of several complex SVs in autosomal hotspots
revealed features observed in short indels above,
specifically template switching and signatures of
MMEJ (fig. S13). This suggests that mechanisms
underlying short indels, i.e., annealing of extended
or unextended ssDNA flanking the
break site, may also explain many SVs. If this
hypothesis is correct, then we would expect an
excess of SVs in a size range within the extent
of resection (~2 kb). Comparison of SV sizes
confirms that this is the case for both inser- 
tions and deletions (P = 5 x 10 *” for deletions, 
P=10 * for insertions; figs. S3B and S12, B to G) 
(41). The preponderance of SVs smaller than 

Fig. 8. Provenance of SVs in hotspots. (A) Sequence context of autosomal
hotspot centers (left) and hotspot-proximal SV breakpoints in deCODE-SV for
SVs <2 kb (middle) and ≥2 kb (right). The contexts/repeat families shown are
unique DNA (pink), Alu (green), L1 (yellow), L2 (blue), TR (orange), and others
(gray). SVs with breakpoints within 100 bp of hotspot centers were included.
(B) Histogram of the maximal error-free microhomology between the deleted
sequence and its flanking sequence for SV deletions <2 kb that had neither
breakpoint in a TR sequence (n = 106) and had a breakpoint within 100 bp of a
hotspot center. Complex events (i.e., deletions with accompanying insertions,
n = 16) were excluded due to uncertainty in identifying the correct position for
comparison. (C) Same as (A), but showing the context of X chromosome hotspot
motif centers (left) and hotspot-proximal breakpoints (right) for SV deletions
(all sizes). (D) Same as Fig. 7C, but for the X chromosome. (E) Subtelomeric
locus with SV deletions (red arcs) between hotspot centers (within 200 bp from
the PRDM9 motif midpoint). Hotspots (horizontal green lines) and the CHL1 gene
body (black line) are shown.


2 kb with breakpoints in TRs (Fig. 8A) is con- 
sistent with our model (fig. S11D). Among non- 
TR SV deletions and duplications smaller than 
2 kb, microhomology was observed in 84 and 94% 
of events, respectively, with median microho- 
mologies of 10 and 15 bp, respectively (Fig. 8B 
and fig. S12H) (47). We consider mechanisms 
underlying the small proportion of hotspot SVs 
that are >2 kb in (42) (fig. S12, I and J).
Finally, we examine the abundance of SV 
deletions in X chromosome hotspots (Fig. 3D). 
In addition to being more numerous per base 
pair, they are systematically larger: Their aver- 
age size (5.4 kb) is twice that of autosomal 
hotspots (2.7 kb), with 34% being >2 kb 
(compared with 12% on the autosomes; P = 
2 x 1077; fig. S12K). Although the number of 
SV deletions is correlated with hotspot inten- 
sity and background mutation rate on both 
the X chromosome and the autosomes, their 
relative impacts are distinct: Hotspot intensity 
is a stronger predictor on the X chromosome, 
whereas background mutation rate is a stron- 
ger predictor on the autosomes (47). Consistent 
with this, the proportion of SV deletions in 
unique DNA in X chromosome hotspots re- 
sembles the proportion of hotspots themselves 
(Fig. 8C). The median microhomology in non- 
TR SV deletions in X chromosome hotspots is 
1 bp, which is lower than that on the autosomes
(P = 2 x 10“) and similar to expectations
from NHEJ. Microhomology in short
deletions in X chromosome hotspots is also 
lower than autosomal hotspots and similar to 
deletions outside hotspots (Fig. 8D). 
Collectively, these analyses suggest reduced 
use of microhomology-mediated repair on the 
X and higher impact of a process such as NHEJ, 
which directly ligates DNA ends. Why might X 
chromosome breaks be repaired differently 
from those on the autosomes? The X chromo- 
some lacks a homolog in male meiosis, and 
regulation and repair of breaks thereon differ 
from the autosomes in several respects (78, 79). 
Although the extensive DNA resection that 
accompanies meiotic breaks is expected to dis- 
favor NHEJ (75), it is possible that some X 
chromosome breaks are processed without DNA 
resection, e.g., through the alternative break-end 
processing mechanism discussed above (fig. 
S8C) (11, 56, 57). Another possibility is that 
resected DNA is filled in before NHEJ (80, 81).
The X chromosome SVs that we observe could 
be due to NHEJ between sites of programmed 
breaks or between sites of programmed and 
sporadic breaks. The first of these possibilities 
can be tested with the present data and,
although uncommon, we find significant
evidence for end joining between hotspot centers
[P = 0.009, polarization test (47)]. These events 
are predominantly near telomeres (Fig. 8E, 
nine of 11 are within 2 Mb of chromosome 
ends; P = 4 x 10°”) and are overrepresented 
on the X chromosome (three of 11; P = 0.008).
Subtelomeric hotspots have an intense burden
of programmed breaks in male meiosis and 
exhibit distinct kinetics of repair (82, 83). Our 
analyses suggest that, as with the X chromo- 
some, they may rely on otherwise disfavored 
pathways to repair some of them. 


Discussion 


The induction and repair of hundreds of pro- 
grammed DNA DSBs is a central part of the 
creation of eggs and sperm. Despite their multi- 
generational impacts on human health and 
diversity, an understanding of errors in these 
processes has been hampered because they are 
individually very rare. 

Here, we have shown that they are collectively 
common: One in four sperm and one in 12 eggs 
has a de novo mutation specifically caused by 
meiotic breaks. We demonstrate that the pre- 
viously reported link between de novo single- 
base substitutions and crossovers is only the 
tip of the iceberg, with the overall burden due 
to meiotic breaks being almost an order of 
magnitude higher. These data further show that
the mutation rate per break is about threefold 
higher in paternal relative to maternal meiosis. 
This is comparable to the genome-wide differ- 
ence in the number of DNMs inherited from 
fathers relative to mothers. Recent work pro- 
vides evidence that the higher mutation rate 
in males is driven primarily by differences in 
the balance between DNA damage and repair in 
males and females (84). If the lower accuracy of 
repair that we have observed for meiotic breaks 
in males holds across germline breaks more gen- 
erally, then it could explain the higher rate of 
paternally inherited mutations genome wide. 

In addition to single-base substitutions, we 
find that DSBs lead to mutation rates 100 to 
1300 times higher per break for indels and SVs 
than would be expected in those regions in the 
absence of a break. These rates are affected by 
the size, nature, and context of the mutations, 
with SVs biased toward insertions in the auto- 
somes and deletions on the X chromosome. 
These findings are consistent with previous 
work where the latter exists: namely, excess 
of DNMs near crossovers (7), excess of rare 
SNPs in hotspots (20, 23), and TR instability 
(24, 25). Despite <1% of potential hotspot sites 
undergoing a break in any given meiosis, the 
high mutation rates per break make hotspots 
a significant force in germline mutagenesis. 

We provide multiple lines of evidence that a 
repertoire of error-prone DNA repair mecha- 
nisms, e.g., translesion synthesis, microhomology- 
mediated end joining, break-induced replication, 
and NHEJ are involved in human meiotic break 
repair. We find that, for single-base substitutions, 
distinct mechanisms are at play at the hotspot 
center and inside and outside the D-loop. Most 
autosomal indels and SVs arising from meiotic 
breaks show evidence for microhomology- 
mediated end-joining. By contrast, our analyses 
show that the canonical model for generating 
SVs, i.e., nonallelic homologous recombination, 
cannot explain them. It is surprising that 
many of these pathways, which are normally 
associated with repair in somatic cells (espe- 
cially cancer cells) and mutant organisms 
(11, 18, 19, 48, 49, 56, 57, 60, 65, 69, 72, 73), 
are active in response to programmed breaks 
in human germ lines at large. 

Comparison between the autosomes and 
the X chromosome suggests that whereas micro- 
homology-mediated end joining is mutagenic, 
it protects against larger and potentially even 
more deleterious mutations. The mutation 
burden is particularly high near telomeres and 
on the X chromosome, both of which face 
specific challenges in male meiosis (78, 82). 
Collectively, these data suggest that many meio- 
tic mutations accrue at times of stress, e.g., when 
breaks cannot be repaired with the intended 
pathway or within the requisite time frame. 

Misrepair of meiotic breaks is thereby a cause
of disease, with a 41% increase, on average, in the
rate of pathogenic mutations in exonic regions
overlapping hotspots genome wide. Our analy- 
ses have identified 81 genes with significantly 
higher pathogenic mutations due to this pro- 
cess. Furthermore, we have identified 278 genes 
with loss-of-function mutations attributable to 
meiotic breaks (five genes overlapped in these 
lists). These 354 genes are involved in a range of
developmental disorders and cancers and, to 
the best of our knowledge, only three of them 
have previously been linked with mutations gen- 
erated as a consequence of meiotic break repair. 

Sexual reproduction requires homologous chro- 
mosomes to pair up to resegregate into haploid 
gametes. Adaptation of the tools of DNA repair 
to achieve this challenging task lies at the heart 
of meiosis, and multiple aspects of this process 
are conserved from yeast to human (1, 85). The
meiotic program must thus find an equilibrium 
between the risk of infertility due to insufficient 
breaks (86) and the cost of pathogenic muta- 
tions. Our analyses show that this cost is con- 
siderably more severe than had been suspected. 

The overrepresentation of programmed breaks 
near exons in humans and at transcription start 
sites in many species is therefore surprising and 
suggests that there is an evolutionary benefit to 
positioning breaks near genes. It is possible that 
the chromatin environment in these regions pro- 
motes repair, thereby increasing the likelihood of 
successful chromosome pairing (87). If this is the 
case, then this would imply that the increase in 
fertility afforded by this strategy outweighs the 
burden of genetic disease from misrepair of DSBs. 

The evolutionary cost of incorrectly repaired 
breaks is predicted to be particularly acute for 
the sex chromosomes, in which lower effective 
population sizes and reduced crossing over im- 
ply that efficiency of natural selection will be 
lower. Concentration of breaks toward telomeres 
in males and lower gene density on the sex chro- 
mosomes may in part reflect an evolutionary 
response to the mutation burden of DSBs. Ex- 
tensive DNA resection accompanying breaks, al- 
though incurring a clear mutational cost, likely 
contributes to correct chromosome pairing and 
safeguards against more catastrophic genome 
instability. The mechanisms underlying meiotic 
recombination thus perform a delicate evolution- 
ary balancing act between the benefit of sexual 
reproduction and the burden of genetic disease. 


Materials and methods 
Hotspot calling 


We used chromatin immunoprecipitation- 
sequencing (ChIP-seq) data for ssDNA bound 
to DMC1, which was measured in testes of a
human male homozygous for the A allele and
one heterozygous for the C and L4 alleles
(20, 39). Hotspots were called and their DMC1 
intensities were estimated using our peak- 
calling methodology (38). We identified the 
most likely PRDM9-binding site within each 
hotspot using a motif-calling algorithm (88) and 
defined its midpoint to be the hotspot center.
One AA hotspot (out of 28,286) with an intensity 
estimate that was a large outlier was excluded 
from analyses involving hotspot intensities. 


Estimating the full burden of de novo 
single-base substitutions in human 
recombination hotspots 


We used published data of DNMs and crossovers 
identified in 2976 Icelandic trios (7). Only a 
subset of programmed meiotic breaks are 
repaired with a crossover. Detailed methodol- 
ogy for inferring the burden of DNMs due to 
meiotic breaks, including those not associated 
with a crossover, is provided in (47). 


Footprints and rates of single-base 
substitutions, indels, and SVs in hotspots 


We used the gnomAD v3.0 dataset (45) and only
used variants that passed all gnomAD filters. In
addition, we restricted our analysis to variants 
that had a positive variant quality score 
(AS_VQSLOD > 0). This included 368 million 
SNPs and 64 million indel calls. We then fil- 
tered for variants with allele frequency <10~°, 
because extremely rare SNPs are recent enough 
for the impact of selection and meiotic drive to 
be small and have proven to be a powerful 
source for research in human mutation (23). This 
provided 341 million SNPs and 56 million indels.

In the plots for base-specific single-base 
substitutions, we corrected for differences in 
sequence composition with a base-by-base nor- 
malization. For example, in the case of C>T 
mutations, we divided the number of extremely 
rare C>T SNPs at each position (relative to the 
hotspot center) with the number of times a C 
base was observed at that position in the refer- 
ence genome. The same approach was extended 
to mutations within each trinucleotide context. 
Some sites may have experienced more than one 
independent mutation in the genealogical his- 
tory of the individuals in the gnomAD sample 
set (89). Because each site is reported only once 
per mutation type in gnomAD, it is possible 
that we underestimate the mutation burden 
for the most strongly enriched mutation types, 
e.g., CpG>TpG in the off-center peaks. 
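
The base-by-base normalization described above can be sketched in Python as follows. This is a minimal illustration rather than the code used in the study: the function name, the input representation (lists of offsets relative to hotspot centers), and the toy numbers are all assumptions.

```python
from collections import Counter


def normalized_rate(mutation_offsets, ref_base_offsets):
    """Per-position mutation rate corrected for sequence composition.

    mutation_offsets: offsets (bp from hotspot center) of observed rare
        mutations of one type, e.g. C>T, pooled across hotspots.
    ref_base_offsets: offsets at which the reference carries the mutable
        base, e.g. C, pooled across the same hotspots.
    Returns {offset: mutations per occurrence of the reference base}.
    """
    mutations = Counter(mutation_offsets)
    opportunities = Counter(ref_base_offsets)
    return {pos: mutations[pos] / n for pos, n in opportunities.items()}


# Toy data: the reference has a C at offset 0 in four hotspots, at offset 1
# in one hotspot, and at offset 2 in one hotspot.
rates = normalized_rate([0, 0, 1], [0, 0, 0, 0, 1, 2])
print(rates)
```

Dividing by the number of opportunities at each offset prevents positions that are enriched for a given base, such as those within the PRDM9-binding motif, from appearing artificially mutable.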

We used SV calls from the Icelandic pop- 
ulation (deCODE-SV) (24) and the gnomAD 
populations (gnomAD-SV version 2.1) with allele 
frequency <10~? (50). 

We inferred the per-DSB fold excess in 
mutations as follows. Consider a hotspot and let 
B be the event of a break in it in a meiosis and 
M be the event that it incurs a mutation at a
specific position in that meiosis. 


$$P(M) = P(M \mid B)\,P(B) + P(M \mid B')\,P(B')$$


Let the background rate $P(M \mid B')$ be $r_{bg}$. Because
$P(B) \ll 1$, $P(B') \approx 1$ and


$$P(M \mid B) = \frac{P(M) - r_{bg}}{P(B)}$$

We wish to calculate the mutation rate per
break averaged across n hotspots and m meioses
in the sample, i.e.,


$$\frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} P(M_{ij} \mid B_{ij})$$


The probability of a break in a hotspot in
individual meioses in the genealogical history
of a sample ($P(B_{ij})$) is not known. However, we
have estimated the average probability of a
break in an A-allele hotspot from a present-day
Icelandic sample (0.8907%), as described in (41).
Assuming that $P(B_{ij}) \approx \mu_B = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} P(B_{ij})$,


$$\frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} P(M_{ij} \mid B_{ij}) \approx \frac{\frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} P(M_{ij}) - \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} r_{bg}}{\mu_B}$$


We calculated the background rate of mutations
as the number of mutations per base in
the regions 5 to 10 kb from hotspot centers,
excluding regions that overlap with another
nearby hotspot. The number of meioses in the
sample is unknown, so we are able to infer only
the fold excess, which is reported.


$$\frac{\frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} P(M_{ij} \mid B_{ij})}{\frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} r_{bg}} \approx \frac{\frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \left(P(M_{ij}) - r_{bg}\right)}{0.008907 \times \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} r_{bg}}$$


Note that this estimate is conservative for 
gnomAD because a significant proportion of 
populations included in the gnomAD dataset 
have PRDM9 alleles with binding properties 
distinct from those of the A allele.
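
To make the arithmetic of the per-DSB fold excess concrete, the following Python sketch computes it from a hotspot rate and a background rate expressed in the same units (e.g., rare variants per base). The function name and the example rates are hypothetical; only the break probability of 0.008907 comes from the text. Because the unknown number of meioses scales both rates equally, it cancels in the ratio.

```python
import math


def fold_excess_per_break(hotspot_rate, background_rate, break_prob=0.008907):
    """Fold excess in mutation rate per break relative to background.

    hotspot_rate: mutations per base observed inside hotspots.
    background_rate: mutations per base in the local background.
    break_prob: average probability of a break in a hotspot per meiosis.
    """
    if background_rate <= 0:
        raise ValueError("background_rate must be positive")
    if not 0 < break_prob <= 1:
        raise ValueError("break_prob must be in (0, 1]")
    # Rate attributable to breaks, rescaled to a single break, over background.
    return (hotspot_rate - background_rate) / (break_prob * background_rate)


# Hypothetical example: a twofold excess of rare variants inside hotspots.
fold = fold_excess_per_break(hotspot_rate=2e-3, background_rate=1e-3)
print(round(fold, 1))
```

The example shows how a modest excess at the population level corresponds to a large excess per break, because fewer than 1% of hotspots experience a break in any given meiosis.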

In the plots of the per-DSB excess in indel and 
SV mutation rates, we have used both break- 
points unless otherwise specified. In specific 
plots, the hotspot-proximal and the hotspot- 
distal breakpoints were identified by comparing 
their respective distances to the closest hotspot 
center and are shown separately. 

To calculate point estimates of the elevation 
of mutations per break in hotspots, we used 
counts of indels and SV breakpoints in the 
central 100 bp of hotspots because DNA DSBs 
are concentrated mainly in this region in mouse 
hotspots (46). The 95% CIs were estimated using 
bootstrap (10,000 bootstrap samples for SVs and 
2000 for indels). 
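
A bootstrap CI of this kind can be sketched as follows. The text does not state the resampling unit or the interval type, so this sketch assumes that hotspots are resampled with replacement and that a percentile interval is used; the function name and the toy counts are likewise illustrative.

```python
import random
from statistics import mean


def bootstrap_ci(values, statistic=mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a statistic.

    values: one observation per resampling unit (assumed here to be a
        per-hotspot count of breakpoints in the central 100 bp).
    """
    rng = random.Random(seed)
    n = len(values)
    stats = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy per-hotspot counts.
counts = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(bootstrap_ci(counts))
```

Any statistic that maps a list of per-hotspot values to a number, such as a ratio of summed counts, can be passed in place of the mean.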


Repetitive DNA 


The repeat context of indel breakpoints was iden- 
tified using the RepeatMasker track for build 38 
downloaded from the UCSC Genome Browser at
https://genome.ucsc.edu/cgi-bin/hgTables. 

TR annotations were downloaded using the 
simpleRepeat track from the UCSC Genome 
Browser, which is based on Tandem Repeats 
Finder (TRF) (90). TRs are defined as “two or 
more adjacent, approximate copies of a pat- 
tern of nucleotides” (https://tandem.bu.edu/ 
trf/trf.html) and include microsatellites, mini- 
satellites, and other TRs with a period size 
between 1 bp and 2 kb. In cases where the 
RepeatMasker and simpleRepeat track both 
annotated the same sequence, we used the 
TR annotation. 


Modeling insertions 


Only indels with both breakpoints in the same 
context were used in context-specific analyses. 
Indels for which neither breakpoint overlapped 
a RepeatMasker or TR sequence were deemed
to be in “unique DNA.” 

For specific analyses measuring indel homol- 
ogy and microhomology in unique DNA (Fig. 7), 
we performed more stringent filtering to avoid 
overestimating (micro)homology from indels in 
repetitive DNA that may have escaped the filters 
above. In the event of multiple equivalent inser- 
tion or deletion positions, as is the case for many 
indels in TRs and homopolymer runs, the first 
of those positions is reported in gnomAD. There- 
fore, we filtered out any sites with more than 
one insertion (or deletion) in these analyses as a 
further check against inclusion of homopolymer 
or TR sequences. 

For perfect side-by-side duplications, it is not 
possible to determine the true breakpoint. For 
a duplication of size n, the true insertion point 
can be any one of n + 1 possible sites: imme- 
diately upstream or downstream of the dup- 
licated sequence or anywhere in between. The 
first of these positions is reported in gnomAD, 
as mentioned above. We also use this repre- 
sentation, without loss of generality. Imperfect 
duplications can also have multiple repre- 
sentations. In such cases, for consistency, we 
chose the representation that maximized 
homology with the right-flanking sequence, 
which was almost always the one reported in 
gnomAD (98%). We excluded insertions where 
the findings (e.g., microhomology) were dif- 
ferent for these different representations to 
avoid biasing the results. This was also fre- 
quently the case in complex SVs where the 
inserted sequence showed homology with 
more than one locus. 
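
The ambiguity in placing a perfect side-by-side duplication can be illustrated with a short Python sketch that enumerates every insertion position yielding the same alternate sequence. The function name and the toy reference are assumptions for illustration only.

```python
def equivalent_insertions(ref: str, pos: int, ins: str):
    """List all (position, inserted sequence) pairs that produce the same
    alternate sequence as inserting `ins` into `ref` before index `pos`."""
    alt = ref[:pos] + ins + ref[pos:]
    k = len(ins)
    matches = []
    for p in range(len(ref) + 1):
        candidate = alt[p:p + k]
        # Keep the position only if it reproduces the alternate sequence.
        if ref[:p] + candidate + ref[p:] == alt:
            matches.append((p, candidate))
    return matches


# Toy example: duplication of "ACG" (n = 3) within TTACGGG.
for position, inserted in equivalent_insertions("TTACGGG", 2, "ACG"):
    print(position, inserted)
```

For this duplication of size n = 3, the sketch returns n + 1 = 4 equivalent positions, the first of which corresponds to the left-most representation.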

We modeled the provenance of insertions 
using MCMC. Detailed methodology for in- 
ferring the provenance of insertions is pro- 
vided in (41). 


Gene annotations 


Gene and exon annotations were downloaded 
from Gencode (v42). We restricted our analysis 
to protein-coding genes and used Ensembl 
canonical transcripts to define gene and exon
boundaries. 


Estimating the probability that a variant 
emerged as a consequence of meiotic 
break repair 


Consider the fold enrichment f in the number
of SV and indel breakpoints at a given distance d 
from the hotspot center. We infer that the prob- 
ability that a variant, which has its hotspot- 
proximal breakpoint at this distance, arose due 
to recombination is (f - 1)/f. We calculated 
this value for each base pair distance d from 
hotspot centers on average (fig. S7, C and D). 
We assumed that f decreases monotonically 
with distance from the hotspot center (this 
assumption is well supported by the data; Fig. 3), 
but that the data are noisy. Therefore, we fitted 
a piecewise-constant monotonic function to 
the data with a node point at every d. The sum 
of squares of the deviation between the data 
and regression function was minimized, which 
is equivalent to maximum likelihood estima- 
tion under an assumption that errors are nor- 
mally distributed. The resultant quadratic 
program was solved in R using the quadprog 
package. 
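
The fit described above was solved as a quadratic program in R. As an illustration, the Python sketch below instead uses the pool-adjacent-violators algorithm, which gives the same least-squares solution for a monotone piecewise-constant fit with a node at every point. The function names, the clamping of probabilities at zero, and the toy values are assumptions and do not reproduce the original analysis.

```python
def fit_nonincreasing(y):
    """Least-squares non-increasing fit via pool-adjacent-violators."""
    blocks = []  # each block holds [sum of values, number of values]
    for value in y:
        blocks.append([value, 1])
        # Merge while the previous block mean is below the current one,
        # which would violate the non-increasing constraint.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


def prob_from_break(f):
    """Probability (f - 1)/f that a variant arose from break repair,
    clamped at 0 when the fold enrichment f is below 1 (an assumption)."""
    return max(0.0, (f - 1.0) / f)


# Toy fold enrichments at increasing distance from the hotspot center.
noisy_f = [5.0, 3.0, 4.0, 1.0]
smooth_f = fit_nonincreasing(noisy_f)
print(smooth_f)
print([prob_from_break(f) for f in smooth_f])
```

In this toy example, the out-of-order values at the second and third distances are pooled to their mean so that the fitted enrichment never increases with distance.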


ClinVar data 


We downloaded the ClinVar data on 19 
January 2023. We restricted our analysis to 
variants that had the ClinSigSimple field set to 
1, flagging pathogenic and likely pathogenic 
variants. 

Time-resolved crystallography captures light-driven DNA repair

Photolyase is an enzyme that uses light to catalyze DNA repair. To capture the reaction intermediates
involved in the enzyme’s catalytic cycle, we conducted a time-resolved crystallography experiment. 

We found that photolyase traps the excited state of the active cofactor, flavin adenine dinucleotide 
(FAD), in a highly bent geometry. This excited state performs electron transfer to damaged DNA, 
inducing repair. We show that the repair reaction, which involves the lysis of two covalent bonds, occurs 
through a single-bond intermediate. The transformation of the substrate into product crowds the 
active site and disrupts hydrogen bonds with the enzyme, resulting in stepwise product release, with the
3' thymine ejected first, followed by the 5' base.


Photolyases are enzymes that repair DNA
lesions induced by solar radiation, using
light to do so (1). The ancient origins of
these enzymes suggest that they were
essential for early organisms to maintain
genome integrity. They remain an important
DNA repair mechanism in nearly all species
today (2-4). 

Distinct photolyases have evolved to repair 
the two most common DNA photolesions: cyclo- 
butane pyrimidine dimers (CPDs) and 6-4 
adducts (2-4). CPDs account for ~80% of sunlight- 
induced DNA damage events. These lesions 
consist of two non-native carbon-carbon bonds 
between pyrimidine bases, most commonly 
sequence-adjacent thymines (5-9). CPD photo- 
lyases, which specifically target CPD lesions, break 
these bonds, restoring the bases to their func- 
tional structure (Fig. 1). To do so, they use a 350- 
to 450-nm photon as part of the catalytic cycle, 
making them one of the few known photo- 
enzymes (10). 


Repair of CPDs by photolyase begins with 
photoexcitation of a bound reduced flavin ade- 
nine dinucleotide cofactor (FADH-) (6, 11), either 
through direct photon absorption by FADH- or 
by resonant energy transfer from a second 
“antenna” cofactor that harvests radiation across 
a wider range of the visible spectrum (12). Within
nanoseconds, the excited state (FADH-*) trans- 
fers an electron to the CPD lesion (7, 13-16). Facil- 
itating electron transfer is key for efficient DNA 
repair. Chemical models have shown that reduc- 
tion of CPD is sufficient to break the pyrimidine 
dimer and produce repaired bases, even in the 
absence of the enzyme active site (6, 17-19). 

Because the FADH-* excited state decays in 
tens of picoseconds in solution, an important 
question is how the enzyme stabilizes this excited 
state so that electron transfer occurs before de- 
excitation. The ratio of electron transfer to de- 
excitation events is a key factor in the overall 
quantum efficiency, which is very high in CPD 
photolyases, with reported values from ~50% 




ture, achieve maximum quantum efficiencies 
of 1 to 5% (6, 17-19). To achieve these high quan- 
tum efficiencies, the enzyme must accommodate 
the transition from the ground to excited state, 
but then trap this state so that the deexcitation 
process is slower than the electron transfer 
time scale. The structure of this trapped inter- 
mediate, however, has to date been unknown. 

Of further interest is the role of the FAD 
binding mode, which is distinct to the photo- 
lyase and cryptochrome family (7-9, 22). The 
enzyme bends FAD into a U-shaped confor- 
mation (Fig. 1A), such that the FAD adenine 
moiety sits between the electron-donating iso- 
alloxazine ring and the electron-accepting CPD. 
Spectroscopic studies have suggested that ade- 
nine mediates the electron transfer (15, 23), but a 
precise accounting of why this U-binding mode 
was selected by evolution is lacking. Structural 
characterization of adenine in the excited state is 
therefore of great interest. 

After electron transfer to the CPD, the carbon- 
carbon bonds that form the nucleobase lesion 
are cleaved (13-15, 24). Currently, the precise 
mechanism of carbon-carbon bond lysis is de- 
bated, especially whether the reaction proceeds 
through an intermediate with a single bond or, 
alternatively, if both bonds break simultaneously 
without crossing an energy barrier (25). 

Following CPD lysis, rapid release of the 
cleaved bases is essential to enable the next 
turnover and maximize the total number of
lesions repaired. Product release occurs in ~50 μs
(7, 9, 26, 27), substantially faster than the
dissociation rate for the enzyme-substrate
complex, which is on the order of seconds to
minutes. This increased affinity for substrate 
over product is clearly advantageous; how- 
ever, how the enzyme’s structure and dyna- 
mics enable this discrimination has not yet 
been accounted for. 


Time-resolved crystallography targets the 
structural intermediates of photolyase catalysis 


To determine the structure of the excited state, 
capture intermediates populated during CPD 
lysis, and characterize product release, we con- 
ducted time-resolved serial femtosecond crystal- 
lography (TR-SFX) experiments at the SwissFEL 

Fig. 1. The reaction cycle of DNA photolyase
captured by time-resolved crystallography. 
(A) Structure of photolyase from M. mazei 
cocrystallized with a dsDNA 14-mer containing a 
CPD. CPDs are DNA lesions caused by exposure 
to UV light. They consist of two pyrimidines (here,
sequence-adjacent thymines, 5'T and 3'T)
that are cross-linked by two non-native carbon- 
carbon bonds. The cross-linked thymines, 
incorporated into the DNA backbone by phos- 
phodiester bonds, kink the dsDNA helix. CPD 
photolyases bind DNA at this kink and flip the 
CPD out of the double helix, positioning it 
adjacent to a FAD cofactor. This FAD is active 
when fully reduced (FADH-). Upon photon 
absorption, FADH- can transfer an electron to the 
CPD, inducing lysis in the non-native bonds and 
effecting DNA repair. (B) Region of interest 
showing the CPD binding mode with respect to 
the isoalloxazine and adenine ring systems that 
comprise FAD, demonstrating how adenine sits in 
between the FAD and CPD. Maps shown are 

2MF obs—DF calc at 2.5 o, dark state. (©) Schematic 
of the reaction mechanism, including FAD states 
(left), the corresponding thymine dimer states 
(middle), and the pump-probe delays acquired in 
this study (right). Atom numbers represent indices 


— dark 
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referred to in the main text. The blue line indicates photon absorption (hv) and the orange line, electron transfer (ET), with dashes representing the possibility of electron loss 
resulting in a mixture of states (fig. S16 and S21). At the longest time point studied (100 us), the 5' thymine was only partially released, as indicated by the time axis. 


Fig. 2. FAD butterfly bending accompanies electronic excitation and
subsequent electron transfer. (A) Excited state of FAD shown superimposed on
the ground state (3 ps: purple, waters as red spheres; dark state: gray,
waters as gray spheres). The FAD isoalloxazine ring participates in several
hydrogen bonding interactions in both the dark and 3-ps structures. Dashed
lines represent hydrogen bonding interactions (yellow, present at 3 ps; red,
broken between dark state and 3 ps; gray, maintained water-toggle hydrogen
bond). (B) Excitation induces severe butterfly-like bending around the
N5-N10 axis of FAD, with angles indicated. Maps, extrapolated polder
mFextr − DFcalc OMIT at 5 σ. Dark, fully reduced (FADH⁻); 3 ps, excited
state (FADH⁻*); 3 ns, semiquinone (FADH•). (C) The same as (A), but a
rotated view with the isoalloxazine in the plane of the page and with the
Fobs,3ps − Fobs,dark difference map superimposed (teal, +4.5 σ; orange,
−4.5 σ).


free-electron laser (27). We cocrystallized the CPD photolyase from the
archaeon Methanosarcina mazei (PLmmCPD), lacking the antenna cofactor, with
a double-stranded DNA (dsDNA) substrate that contained a synthetic cis-syn
thymine dimer (a specific CPD). Prior to crystallization, the protein was
photoreduced and maintained in this catalytically active state by using an
anaerobic environment. Our crystals formed an orthorhombic lattice with two
protein-DNA complexes in the asymmetric unit. All results here refer to one
of these two complexes, which is better ordered than its counterpart
(fig. S22). By using a viscous media injector, a stream of microcrystals
embedded in a cellulose matrix was delivered to the interaction region. DNA
repair was initiated by a 1.1-ps pulse of 396-nm laser light with a peak
power of 360 GW cm⁻². This power, although sufficient to drive nonlinear
processes (27-29), produced interpretable difference map signals
(Fobs,light − Fobs,dark). By contrast, in data obtained at 115 GW cm⁻², the
same signals could not be readily distinguished from noise (27).

The resulting dynamics were probed by a pulse from the XFEL at time delays
ranging from 3 ps to 100 µs. To understand how photolyase interacts with the
excited state FADH⁻*, we collected an early time point at 3 ps.
Subsequently, to characterize CPD lysis, we collected five time points
between 300 ps and 30 ns. Then, to observe product release, we measured four
time points between 1 and 100 µs (Fig. 1B and table S1). From the resulting
diffraction data, we determined that 16 to 22% of the illuminated photolyase
molecules undergo light-activated dynamics (27). By subtracting the residual
nonactivated signal, we computed extrapolated structure factors and used
these to refine 10 time-resolved structures (27, 30). We observed
quantitative improvement in the models by refining against extrapolated data
up to 2.1 to 2.4 Å depending on time point (table S1).




Fig. 3. Electronic excitation disrupts local water networks around adenine.
(A) Water networks in the active site at 3 ps (3 ps, purple and red spheres;
dark state, gray and gray spheres). Red dashed lines represent broken
hydrogen bonds and yellow dashed lines represent retained hydrogen bonds.
Arrows highlight key waters that become disordered following electronic
excitation of FADH⁻, where Fobs,3ps − Fobs,dark maps (teal, +4.5 σ; orange,
−4.5 σ) show a loss of density. In the dark model, the N1A-water distance is
2.4 Å, and the N7A-water distance is 2.8 Å. See fig. S19 for the
2mFextr − DFcalc map. (B) The time-resolved structure at 3-ps pump-probe
delay, showing Fobs,3ps − Fobs,dark maps (teal, +4.5 σ; orange, −4.5 σ),
indicating the adenine region of interest shown in (A).




Fig. 4. DNA repair proceeds through a one-bond intermediate, resulting in
geometric mismatch with the active site. (A) Extrapolated polder
mFextr − DFcalc OMIT maps (blue, +6 σ) at 1-ns pump-probe delay provide
evidence for a one-bond intermediate. Blue, structure and maps for 1-ns
delay; gray, dark structure with both bonds formed. (B) By 3 ns, repair is
largely complete, with thymine bases forming a coplanar, π-stacked thymine
geometry (3 ns, purple; dark state, gray; Fobs,3ns − Fobs,dark, teal, +4 σ;
orange, −4 σ). This disrupts
However, the resulting maps and models are comparable to those determined
from nonextrapolated data with a highest resolution of 2.5 to 3.0 Å obtained
by rotational collection at a synchrotron (27).
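
The extrapolation step can be illustrated with a minimal sketch. It assumes the commonly used linear form F_extr = F_dark + (F_light − F_dark)/f, where f is the activated fraction; the procedure actually used in this study is the one described in (27, 30) and may differ. The function name and the amplitude values are hypothetical and serve only to show the arithmetic.

```python
def extrapolate_amplitudes(f_dark, f_light, activated_fraction):
    """Linearly extrapolate structure-factor amplitudes to full activation.

    Assumption: F_extr = F_dark + (F_light - F_dark) / activated_fraction.
    This is a common approximation, not necessarily the scheme of the study.
    """
    if not 0.0 < activated_fraction <= 1.0:
        raise ValueError("activated_fraction must lie in (0, 1]")
    if len(f_dark) != len(f_light):
        raise ValueError("dark and light lists must have the same length")
    scale = 1.0 / activated_fraction
    return [fd + scale * (fl - fd) for fd, fl in zip(f_dark, f_light)]


# Hypothetical amplitudes for three reflections (arbitrary units).
f_dark = [100.0, 50.0, 80.0]
f_light = [102.0, 49.0, 80.0]

# An activated fraction of 0.20 lies inside the reported 16 to 22% range.
f_extr = extrapolate_amplitudes(f_dark, f_light, activated_fraction=0.20)
print(f_extr)
```

With a small activated fraction, the difference signal is multiplied several-fold, which is why noise in the light-minus-dark differences is amplified in extrapolated data.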


Cofactor binding site facilitates dramatic 
butterfly bending of FADH- upon excitation 


Repair chemistry begins with the photoexcitation of FADH⁻. To achieve high
quantum efficiency, electron transfer to the substrate must be the fastest
de-excitation pathway for the excited state. Transient absorption
measurements show that FADH⁻* in solution exhibits three decay modes with
time scales of ~10 ps, ~50 ps, and ~2 ns (31) and relative amplitudes of 50
to 60%, 20 to 30%, and 10 to 15%, respectively. These same decay processes
were observed in solutions of reduced flavin mononucleotide (FMNH⁻), which
lacks adenine, demonstrating that quenching by adenine is not responsible
for rapid decay (31). By contrast, similar to other photolyases (15), in the
absence of substrate, PLmmCPD only shows a single ~2-ns decay mode in the
visible spectrum (fig. S6). The enzyme channels all excited-state population
into this pathway, which is sufficiently long lived to enable productive
electron transfer.
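
To make the contrast between the solution and enzyme-bound decay concrete, the following sketch evaluates a sum-of-exponentials survival curve. The time constants are those quoted above; taking the midpoint of each reported amplitude range and renormalizing is an assumption made only for illustration, and the function name is hypothetical.

```python
import math


def surviving_fraction(t_ps, components):
    """Excited-state population remaining at time t_ps (picoseconds).

    components: list of (relative_amplitude, lifetime_ps) pairs.
    Amplitudes are renormalized so the population is 1 at t = 0.
    """
    total = sum(amplitude for amplitude, _ in components)
    decayed = sum(a * math.exp(-t_ps / tau) for a, tau in components)
    return decayed / total


# FADH-* in solution: ~10 ps, ~50 ps, ~2 ns modes.
# Amplitudes are midpoints of the reported ranges (assumption).
solution = [(0.55, 10.0), (0.25, 50.0), (0.125, 2000.0)]

# Enzyme-bound FADH-* without substrate: a single ~2-ns mode.
enzyme = [(1.0, 2000.0)]

for t in (3.0, 100.0, 1000.0):
    print(t, round(surviving_fraction(t, solution), 3),
          round(surviving_fraction(t, enzyme), 3))
```

Under these assumptions, most of the solution population is lost within the first 100 ps, whereas the enzyme-bound population is largely intact over the same interval.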

We characterized this pathway with TR-SFX. 
Before laser illumination, the FADH- cofactor 
was in a bent geometry, characteristic of the 
reduced state, with the flanking benzene and 


hydrogen bonding between the 3' thymine and an ordered water previously
coordinated by Asn257 (red dashed lines, lost H bonds). With this interaction
disrupted, Arg256 forms a new interaction with the 5' thymine (yellow dashed
lines, maintained H bonds), weakening the interactions of the 3' thymine with
the active site. (C) The time-resolved structure at 3-ns pump-probe delay,
showing Fobs,3ns − Fobs,dark maps (teal, +4 σ; orange, −4 σ). The CPD region
of interest shown in (A) and (B) is indicated.
pyrimidine rings forming a +14° “butterfly-bending” angle around the N5-N10
axis (Fig. 2A). This bend inverted 3 ps after laser excitation (Fig. 2B).
The sp³-hybridized N5 and N10 centers underwent pyramidal inversion, while
the benzene and pyrimidine rings, distorted from their ground-state planar
geometries, kinked at −23°. Spectroscopy performed by us in the absence of
substrate (fig. S6) and by others in both the presence and absence of
substrate (7, 15) suggests that FADH⁻* is the only populated species at
3 ps. Accordingly, we assigned the flavin geometry in our 3-ps structure to
the FADH⁻* excited state.
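
As a rough illustration of how a signed butterfly-bending angle about the N5-N10 axis can be computed from coordinates, the sketch below uses a dihedral angle defined by one atom on each wing plus the two axis atoms. This is a simplification: a real analysis would more likely fit planes to all ring atoms, and the sign convention, helper names, and idealized coordinates here are all assumptions.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 axis."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project both wing vectors onto the plane perpendicular to the axis.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def butterfly_bend_deg(benzene_atom, n5, n10, pyrimidine_atom):
    """Signed deviation from a flat (180 degree) arrangement of the wings."""
    phi = dihedral_deg(benzene_atom, n5, n10, pyrimidine_atom)
    return 180.0 - phi if phi >= 0 else -180.0 - phi


def wing_model(bend_deg):
    """Idealized four-atom model folded by bend_deg (hypothetical coordinates)."""
    half = math.radians(bend_deg / 2.0)
    benzene_atom = (-math.cos(half), 0.0, math.sin(half))
    n5 = (0.0, 0.0, 0.0)
    n10 = (0.0, 1.0, 0.0)
    pyrimidine_atom = (math.cos(half), 1.0, math.sin(half))
    return benzene_atom, n5, n10, pyrimidine_atom


dark = butterfly_bend_deg(*wing_model(14.0))
excited = butterfly_bend_deg(*wing_model(-23.0))
print(round(dark, 1), round(excited, 1))
```

The sign simply records which side of the flat arrangement the wings fold toward, so an inversion of the bend shows up as a change of sign.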

The photolyase FADH⁻ binding site enabled 36° of butterfly bending upon
excitation, while simultaneously restricting FADH⁻* from reaching molecular
geometries conducive to de-excitation. Butterfly bending disrupted only a
single hydrogen bond between Arg378 and FAD N5 (2.9 → 4.0 Å; Fig. 2A). By
contrast, all other stabilizing interactions between the protein and FAD,
including hydrogen bonds with Asn403 and Asp409, were maintained upon
excitation (Fig. 2A).

Ordered water molecules near the isoalloxazine readily accommodated this
pronounced butterfly bending, rearranging in response to excitation and
forming a new water network within 3 ps (Fig. 2, A and C). Specifically,
upon excitation, a hydrogen bond between FAD carbonyl O2 and a nearby highly
coordinated water was broken as the water was displaced away from the
isoalloxazine ring (2.7 → 3.9 Å). However, a second water strengthened its
hydrogen-bonding interaction with the same carbonyl (3.0 → 2.3 Å),
simultaneously forming a new hydrogen bond with the sidechain of Ser268.
These two waters were themselves hydrogen bonded to one another both before
and after their rearrangement. This water toggle provides a second mechanism
by which the FAD binding site is flexible enough to allow excitation, while
simultaneously being able to form a hydrogen-bond network in the excited
state.


Electronic excitation disrupts adenine-associated 
water networks 


To investigate why photolyase positions adenine 
between the electron-donating isoalloxazine ring 
and electron-accepting CPD, we interrogated our 
time-resolved structures for dynamics that would 
reveal adenine’s participation in the electron 
transfer reaction. At 3 ps, we observed disrup- 
tion of two water networks in the active site that 
interact strongly with this adenine (Fig. 3). 
The first network consisted of a five-water 
cluster that fills a pocket in the active site of
PLmmCPD near the 3’ thymine (9). The second
network contained two adenine-associated wa- 
ters that fill a small void in the protein struc- 
ture on the opposite side of the adenine ring 
(Fig. 3). One water from each group partici- 
pated in a hydrogen bond with adenine atoms 

Fig. 5. Product release and disruption of DNA/photolyase interactions.
(A) Overview of the DNA binding mode of photolyase before (dark state, gray)
and after (100 µs, purple) repair. The DNA backbone conformation shown is
from the structure at 100 µs. Interactions that are retained or formed at
100 µs are shown in yellow dashed lines, whereas interactions that are lost,
most notably at the thymine dimer site (Po-Arg441 and P.;-Lys451), are shown
as red dashed lines. (B to D) Product release. (B) 10 µs, both thymines bound
in the active site; (C) 30 µs, 3' release; (D) 100 µs, partial 5' release.
After exiting the active site pocket, thymines are only partially ordered:
the conformations of 3’ thymine at 30 µs and 5’ thymine at 100 µs modeled
outside of the pocket are partially occupied (figs. S18 and S21). (E) The
time-resolved structure at 100-µs pump-probe delay, showing
Fobs,100µs − Fobs,dark maps (teal, +4 σ; orange, −4 σ), indicating the CPD
region of interest shown in (B) to (D).


N7A and N1A in the dark state. At a 3-ps pump-probe delay, both of these
waters became disordered, whereas the adenine ring remained effectively
stationary (adenine atomic displacements ~0.2 Å) (Fig. 3). This water
rearrangement propagated through the five-water cluster, disrupting a third
water and inducing a 0.6-Å shift in the position of Arg256 (Fig. 3). At
later time points, a shifting pattern of water density was observed
(fig. S19), but by 10 ns, concurrent with the decay of the excited state
FADH⁻*, both waters directly coordinating the adenine at N7A and N1A
regained order (fig. S19).

We propose three models that explain the notably specific water dynamics
observed. First, electronic coupling between the adenine and isoalloxazine
systems in the excited state (32) is sufficiently strong to disrupt
adenine-water hydrogen bonding. Second, a large difference dipole on the
isoalloxazine system resulting from electronic excitation perturbs these
waters through space (33). Third, the adenine system is vibrationally hot,
and this thermal motion disrupts these coordinated waters. Indeed, the
adenine-ring B factors increase from 22 to 37 Å² between the dark and 3-ps
structures, versus 21 to 29 Å² for the isoalloxazine system, which suggests
relatively increased disorder of the adenine at 3 ps (table S3). Our
structure for this time point cannot distinguish between these models but
establishes a basis upon which to design incisive future experiments and
calculations that can.


Repair proceeds through stepwise bond 
breaking of the cyclobutane pyrimidine 


Between 300 ps and 3 ns, we observed cleavage of the two carbon-carbon bonds
that formed the DNA lesion. Extrapolated polder mFextr − DFcalc OMIT maps
(27, 34) that omit the thymine dimer region at 300 ps and 1 ns unambiguously
show that the reaction proceeds through a transiently populated intermediate
in which the C5-C5’ bond is broken (Fig. 4A, fig. S17, and table S2).
Whether the two cyclobutane bonds break concertedly has been extensively
debated (25), but here we provide clear structural evidence for a
single-bond mechanism.

At 3 ns, the second carbon-carbon bond was broken, and the planar aromatic
thymine systems were restored (Fig. 4B). Between 300 ps and 3 ns, the
structure of the FAD isoalloxazine ring flattened, reaching a bending angle
of −4° by 3 ns (Fig. 2B), which we assign to the FADH• semiquinone state
(77). Concomitantly, the hydrogen bond between the sidechain carbonyl of
Asn403 and FAD N5 shortened (2.9 Å at 3 ps → 2.5 Å for 300 ps through 3 ns;
table S3), confirming predictions from quantum calculations and IR
spectroscopy measurements that this interaction is essential to stabilize
the semiquinone state (35).


Thymine dynamics after lysis drive 
product release 


During repair, structural rearrangements of the product disrupted the
geometric match between the bound bases and the relatively static active
site pocket, ultimately resulting in release of the repaired thymines.
Immediately after lysis, the repaired nucleobases were forced apart,
rotating from a thymine-thymine angle of 43° (dark) to nearly planar (16°,
3 ns) (Fig. 4B). As a result, the volume occupied by the thymine dimer
increased [462 Å³, dark → 475 Å³, 3 ns (27)], crowding the active site
pocket and, most notably, displacing Met379 (Fig. 4B).

In this geometry, the 5’ thymine π-stacks against Trp305 and maintains two
strong hydrogen-bonding interactions with Glu301 (Fig. 4B). By contrast,
rotation of the 3’ thymine disrupted the most mobile piece of the active
site, a hinge consisting of Arg256, Asn257, and an ordered water, which
holds the 3’ thymine in place prior to repair. A hydrogen bond between 3’
thymine N3 and the ordered water coordinated by Asn257 was broken by 300 ps
(2.9 Å, dark → 3.9 Å, 300 ps), causing the water to become mobile and move
away from the nucleobase (Fig. 4B). The Arg256-Asn257 hinge, now
uncoordinated, moved toward the 5’ thymine, exposing the 3’ thymine to
solvent and providing a clear route for product release.

Between 10 ns and 1 µs, the enzyme-product complex underwent only minor
structural variations. The initial steps of product release were observed at
10 µs, with Met379, which was formerly pushed out of the active site,
reversing direction to move into the active site toward the 3’ thymine
(fig. S18). By 30 µs, this thymine flipped out of the pocket completely
(Fig. 5B, fig. S18, and fig. S21), disrupting protein-DNA salt bridges
between Arg411 and Py and Lys451 and P,,; (Fig. 5A). By 100 µs, the 5’
thymine was midway through its withdrawal from the active site, and both
repaired nucleobases exhibited substantial disorder. By contrast, the
phosphate backbone was still coordinated by salt bridges (Fig. 5C, fig. S18,
and fig. S21). We did not observe evidence of a restoration of Watson-Crick
base pairing in our crystal structures, either because such rearrangements
occur on longer time scales or because they are not compatible with the
crystal lattice.

Our results indicate that prior to repair, the 
thymine dimer, with two ring systems strongly 
angled with respect to one another, was com- 
plementary with the active site geometry. By 
contrast, the repaired thymines, which adopt a 
planar π-stacked conformation, occupied a
much larger volume in the active site and 
could not form the same hydrogen-bonding 
interactions at the 3’ base, most notably with 
the Arg256-Asn257-water hinge system. Accord- 
ingly, thermal motion was sufficient to initiate 
release of the 3’ base after tens of microsec- 
onds, followed by the 5’ base hundreds of micro- 
seconds later (fig. S21). Owing to dynamics of 
the bound DNA with respect to a largely static 
enzyme, transformation into product both 
swells the active site and breaks key hydro- 
gen bonds, driving product release. 


Conclusions 


We have determined ten time-resolved struc- 
tures of photolyase in the act of DNA repair. 
The observations of a highly bent excited state 
and single-bond thymine dimer define the key 
intermediates along the reaction pathway. 
These structures provide a foundation to un- 
derstand the chemical mechanism of photo- 
lyase. Rearrangements in the DNA, in contrast 
with modest structural changes in the enzyme 
itself, drive the rate-limiting step of the cat- 
alytic cycle, which is product release. Together, 
our structures illuminate the function of a 
powerful DNA repair system used by nearly all 
lifeforms to survive and thrive under the sun. 
Colossal electrocaloric effect in an
interface-augmented ferroelectric polymer


Shanyu Zheng, Feihong Du, Lirong Zheng, Donglin Han, Qiang Li, Junye Shi,
Jiangping Chen, Xiaoming Shi, Houbing Huang, Yaorong Luo, Yurong Yang,
Padraic O’Reilly, Linlin Wei, Nicolas de Souza, Liang Hong, Xiaoshi Qian


The electrocaloric effect demands the maximized degree of freedom (DOF) of
polar domains and the lowest energy barrier to facilitate the transition of
polarization. However, optimization of the DOF and energy barrier (including
domain size, crystallinity, multiconformation coexistence, polar
correlation, and other factors in bulk ferroelectrics) has reached a limit.
We used organic crystal dimethylhexynediol (DMHD) as a three-dimensional
sacrificial master to assemble polar conformations at the heterogeneous
interface in poly(vinylidene fluoride)-based terpolymer. DMHD was
evaporated, and the epitaxy-like process induced an ultrafinely distributed,
multiconformation-coexisting polar interface exhibiting a giant
conformational entropy. Under a low electric field, the interface-augmented
terpolymer had a high entropy change of 100 J/(kg·K). This interface
polarization strategy is generally applicable to dielectric capacitors,
supercapacitors, and other related applications.


Shifts in the climate, along with well-publicized energy shortages, call for
the development of more energy-efficient, point-of-care cooling and heating
technologies. Electrocaloric effect (ECE)-based refrigeration uses
solid-state materials as refrigerants in an electric-capacitive manner
through efficient charge-discharge cycling, leading to a potentially more
environmentally friendly cooling alternative with zero global warming
potential and low indirect CO₂ emissions (1, 2). Compared with many other
promising alternatives, EC refrigeration (ECR) uses electricity directly
without other heavy accessories, such as magnets, actuators, or compressors
(3-5). ECE refrigeration could potentially be accessible for lightweight
thermal management of localized environments, portable electronics, and
other wearables, in addition to large-scale applications (6-8). Considering
practical issues such as electrical stability and power consumption,
reducing the required electric field to induce a large ECE is the major
challenge for commercialization of ECR.

The ECE comes from the difference between 
two polar entropy states that are linked by an 
electric field. The two states should ideally have 
a large entropy difference but a small energy 
barrier to overcome. Therefore, relaxor ferro- 
electrics are currently dominant in EC research 
because of the large polarization offered by 
entropy changes. To enhance the entropy change 
over a wide temperature window, these ferro- 
electrics have been modified by many meth- 
ods, such as defect modification in ferroelectric 
polymers, multiphase coexistence in ceram- 
ics, and supercritical transitions (9-11). For ex- 
ample, the intrinsic ECE in the well-studied 
poly(vinylidene fluoride-trifluoroethylene- 
chlorofluoroethylene) [P(VDF-TrFE-CFE)] was 
greatly enhanced by increasing the overall crys- 
tallinity but suppressing the formation of large 
crystals. The resulting polar high-entropy fer- 
roelectric polymer exhibited a giant ECE when 
the crystallite size was reduced from 50 to 20 nm 
(12). Similar EC enhancement can be found in
the high-energy electron-irradiated P(VDF- 
TrFE) copolymers when they are transitioned 
from the normal ferroelectric to relaxor ferro- 
electrics, because the radiation reduces the 
size of the polar domain and introduces multi- 
conformation coexistence (13-15). Therefore, 
further reducing the size of polar entities to 
the subnanometer scale seems to be a sound 
strategy. 

However, despite the fruitful hypothetical outcomes (verified by Landau
theory, fig. S1), further reducing the crystallite size to a subnanometer
scale is extremely challenging because of the complex polymer-crystallization
processes involved (16). In addition to reducing the size of the
three-dimensional (3D) crystallites, our goal is to introduce 2D
subnanometer-scale, porous vacancies into the terpolymer (TP) and impose
polar β-like conformations at the polymer interface around the micropores
(17, 18), hence constructing 3D polar interfaces with a large surface area
and a considerably enhanced polar entropy.

We mixed the organic crystal dimethylhexynediol (DMHD) (19), which is highly
miscible with the base TP and exhibits a low boiling temperature, with the
TP as an epitaxial subnanometer-scale 3D master to induce self-assembly of
polar conformations at the heterogeneous interface (20, 21). We later
evaporated DMHD from the nanocomposites to prevent the complications that
occur in many nanocomposites. The escape of the DMHD crystallites left
subnanometer-scale pores, and we observed a polar-enhanced interface
(Fig. 1A) when we used two distinct atomic force microscopy-infrared
(AFM-IR) spectroscopies. The interface-augmented microporous polymer
exhibited a fourfold ECE compared with that of the base TP. Under a low
electric field equal to 20% of the breakdown field, Eb, the polymer
exhibited entropy changes of ~100 J/(kg·K) and EC strengths of more than
1 J/(kg·K·MV) (MV, megavolts). We conducted structural and dielectric
analyses to probe the mechanism of the EC enhancement, which we corroborated
using phase-field analysis and Landau theory. We conducted density
functional theory (DFT) and molecular dynamics (MD) simulations to
understand the self-assembly of the interfacial β-like conformations on the
molecular scale. The resulting interface-augmented TP exhibited a
refrigeration capacity (RC) of 5 x 10° J/kg and maintained stable operation
over 3 million cycles.


EC enhancement in the 
interface-augmented polymer 


Interfacial engineering is widely used for metals, semiconductor materials,
and dielectric insulators to manipulate key features such as mechanical
properties, electronic bandgaps, and internal fields (22-25). For example,
epitaxial growth has been widely used in electronic materials to manipulate
the behavior of the carriers (26). Modifying the chemistry at the
heterogeneous interfaces of soft materials has proved effective in designing
ultratough hydrogels, strong bioadhesives, and materials with a wide range
of wettability (27-29). The ECE could also be enhanced by incorporating
various nanoparticles into a polymer matrix (30, 31). However, whether the
enhancement is induced by nanofillers or interfaces has never been clearly
demonstrated, and the permanently added nanofillers could cause practical
complications, including large size agglomeration, reduced Eb, and quality
issues in mass production (32-34).

Incorporation of the sacrificial DMHD provides an opportunity to study the
intrinsic ECE in a filler-free polymer whose interfaces have been physically
created and structurally modified. We performed ab initio molecular dynamics
(AIMD) simulations to explore the interaction between DMHD and the TP chain.
The results intuitively indicated that the nonpolar α-phase would
spontaneously evolve to the polar β-like phase when DMHD was introduced into
the system (fig. S2A). We hypothesized that the induced β-like phase
remained in the polymer matrix and provided substantially increased
interfacial polar entities for potential EC enhancement after DMHD was
evaporated in the vacuum annealing process (Fig. 1A and figs. S3 and S4).
Relative to TP, full evaporation of the DMHD [confirmed with proton nuclear
magnetic resonance (¹H-NMR, fig. S5) and energy-dispersive x-ray
spectroscopy (EDS, fig. S6)] left many micropores with diameters <3 nm
(Fig. 1B and fig. S7) (35, 36). The fabricated film was as optically
transparent as TP without observable phase separation, confirming the small
size and good dispersion of the pores in the polymer matrix (fig. S5, C and
D). The formation and escape of the DMHD nanocrystals leave fine textures
that converge into wider channels, creating erosional landforms in the
polymer matrix
Fig. 1. The interface-augmented TP induced by DMHD exhibits colossal
ECE. (A) Schematic diagram of a polyol molecule (DMHD) generating a β-like
polar conformation from a nonpolar configuration at the nanopore interface.
(B) The pore diameter distributions of TP and TPD were obtained on the basis of
BET results. (C) TPD-1% exhibits ~300% enhanced entropy changes compared
with the base TP sample. The inset shows the verified ECE results of TPD-1%
obtained with IR camera measurement at 100 MV/m. Sample quantities n = 5,
(37), which were not observed in the base TP
[cryo-electron microscopy (cryo-EM) image
in fig. S8].

We were surprised to find that the DMHD-modified TP (TPD) exhibited a
markedly improved ECE. We observed an entropy change of 100 J/(kg·K) under
100 MV/m, which corresponds to an adiabatic temperature change >20 K
(Fig. 1C). We conducted a direct temperature measurement under the same
field (figs. S12 and S13) using an IR camera and recorded a temperature
change of 17.6 K (Fig. 1C, inset, and movie S1). Neat polymers generally do
not exhibit a temperature-specific EC strength (ΔT/ΔE) reaching 0.2 K/MV,
which is comparable with the best EC ceramic (when ΔT > 5 K),
PbSc0.5Ta0.5O3 (38). The values of the ECE and the EC strength (ΔS/ΔE) were
the highest among the previously reported polymers (Fig. 1D) (supplementary
text, section 2.2). We found that the optimal content of DMHD was 1 wt.%
(denoted as TPD-1%) and that further increasing the DMHD content caused the
formation of large crystals that reduced the ECE (fig. S10, D to F). For
practical application, we proposed a figure of merit, the EC material
quality factor (EMQ), for the neat EC polymers, which includes not only the
entropy
points are centered on the mean, and the bars indicate SD. (D) Comparison of
the electrocaloric parameter ΔS/ΔE versus ΔS among TPD-1% and other EC
materials, where ΔS is the entropy change and ΔS/ΔE is the field-dependent
entropy change. (E to J) Simultaneously measured topography of TP (E) and
TPD (H), corresponding AFM-IR chemical map [(F) and (I)] under irradiation
with a 1288 cm⁻¹ laser equipped with a nanoIR3 system (Bruker), and AFM
height and IR amplitude profiles versus position [(G) and (J)].


and temperature changes but also mechanical 
properties, thermal conductivity, lifetime, and 
other factors (supplementary text, section 2.4). 
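
The relation between the reported entropy change and the adiabatic temperature change can be sketched with the standard approximation ΔT ≈ T·ΔS/c_p. The operating temperature (300 K) and specific heat (1500 J/(kg·K)) used below are assumed, illustrative values and are not taken from this study; the function names are hypothetical. The same sketch divides the reported changes by the applied field to give the field-normalized EC strengths.

```python
def adiabatic_delta_t(delta_s, temperature_k, specific_heat):
    """Approximate adiabatic temperature change, dT ~ T * dS / c_p.

    delta_s:        isothermal entropy change, J/(kg K)
    temperature_k:  operating temperature, K (assumed value)
    specific_heat:  specific heat capacity, J/(kg K) (assumed value)
    """
    return temperature_k * delta_s / specific_heat


def ec_strength(change, field_mv_per_m):
    """Change (entropy or temperature) per unit applied field in MV/m."""
    return change / field_mv_per_m


# Reported values: dS = 100 J/(kg K) and measured dT = 17.6 K at 100 MV/m.
delta_s = 100.0
field = 100.0
measured_delta_t = 17.6

# Assumed, illustrative material parameters (not from the paper).
estimated_delta_t = adiabatic_delta_t(delta_s, temperature_k=300.0,
                                      specific_heat=1500.0)

print(estimated_delta_t)
print(ec_strength(delta_s, field))
print(ec_strength(measured_delta_t, field))
```

Because the estimate scales inversely with the assumed specific heat, a somewhat lower c_p would push the estimated temperature change above 20 K.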


We used IR photoinduced force microscopy (IR-PiFM, Bruker Anasys nanoIR3) to
provide a direct mapping of the polar conformational distribution [more
detail is provided in (39)]. The IR-PiFM provided nanoscale IR chemical maps
by combining wavelength-tunable lasers with AFM to locally detect
microcosmic nanostructures (40, 41). We chose 1288 cm⁻¹ to probe the
samples, which corresponds to the all-trans conformation of PVDF (42). We
show the all-trans chemical maps of TP and TPD-1% in Figs. 1F and 1I, and we
display the simultaneously measured topographies in Figs. 1E and 1H,
respectively. The chemical patterns of TP and TPD-1% were substantially
different, and the all-trans conformations were more evenly distributed in
TPD-1% than in TP, indicating a higher level of polar randomness. Moreover,
further examination showed us that the polar conformations were not formed
on, but in between, the polymeric crystals. The IR peak signal exhibited a
strong correlation with the location of crystalline domains in TP, and the
peaks and valleys of the two signals were positively correlated (Fig. 1G).
By contrast, the peaks
Fig. 2. IR-PiFM characterization of interface-augmented polar and nonpolar
conformations. (A to D) IR-PiFM (VistaScope) measurement from PCA-MCR
analysis: (A) and (C) show topography of TP and TPD, and (B) and (D) show
corresponding PCA-MCR chemical composition diagram extracted from hyPIR
results; the red pattern represents the γ phase of PVDF, and green and blue
represent the α phase and β-like phase, respectively. (E) Normalized photoinduced


in the IR map of TPD-1% were strongly correlated with the localized valleys
in the height map (Fig. 1J), indicating that the polar phases at the
interface of crystalline domains were stronger than those on the polymer
crystals (figs. S15 and S16). Although the resolution is not high enough to
directly image each subnanoscale polar interface, these IR-PiFM mappings
demonstrated the high configurational entropy state of the substantially
enhanced polar conformations between the polymeric crystals.


Interfacial polar and nonpolar conformations 


To investigate the polar high-entropy state in the
interface-augmented TPD-1% in addition to
the all-trans conformation, we collected chemical
images in the range of 1600 to 780 cm⁻¹ with a
spectral resolution of 5 cm⁻¹ using IR-PiFM
(Molecular Vista, Vista One). Hyperspectral
PiFM IR (hyPIR) acquires a spectrum at every
pixel along with the topography, which allowed
us to select any wave number within the range
used (movies S2 and S3). Across the wave-number
range, the maps fall into three distinct patterns,
which can be assigned to the γ-like (red),
α-like (green), and β-like (blue) conformations
(supplementary text, section 3.3). We automatically
extracted images for each phase and integrated
them using principal components analysis and
multivariate curve resolution (PCA-MCR) technology (43-45),
which allowed us to directly visualize the
distribution of the three major conformations
(Fig. 2, A to D, and figs. S22 and S23).
Compared with the rice-like crystals in TP,
TPD-1% exhibited finer and longer polymer crystalline
structures in the topographies (Fig. 2,
A and B), indicating that the DMHD nanocrystals
not only induced the polar conformations
at their interface but also modulated the
lateral growth front of the polymer
lamellar crystals. More importantly, the
PCA-MCR-integrated composite IR image showed
that the β-like conformations were present in
more areas at the surface of TPD-1% than in TP (Fig. 2, C
and D), which confirmed the recorded IR-PiFM
images (Fig. 1I). The directly visualized multiphase
coexistence in the interface-augmented
TPD polymer suggested the concurrent achievement
of higher polar entropy and larger crystallinity
than were found in TP.

To provide a quantitative comparison, we integrated
the signals of α-like and β-like conformations
in the PCA-MCR map (full spectrum)
of TP and TPD-1%. Whereas the normalized
photoinduced force of the β-like phase was only
0.26 times that of the α-like phase for the base
TP, the ratio increased to 8.85 for TPD-1% (Fig.
2E) (46). We also extracted the chemical maps
for characteristic absorptions from the hyPIR
animation (Fig. 2F for TPD-1% and Fig. 2G for TP
at 1288 cm⁻¹). The normalized full IR spectrum
of TPD-1% showed a distinct peak at 1288 cm⁻¹,
compared with TP, which suggested the enhanced
emergence of all-trans conformations
(34) for TPD-1% (Fig. 2H and figs. S17 and S18).
Although we may not directly compare the signal
intensities between the chemical maps because
they were measured separately, the signal
intensity of polar conformations was much
higher than that of the nonpolar conformations
at a randomly selected local point (Figs. 2F and
2G, red cross symbols) on the surface of TPD-1%,
whereas the β-like signal was merely 10%
of that of the α-like conformation in TP
(Fig. 2I). The sharply enhanced polar structures
of TPD-1% suggested a more sensitive response
to the electric field and hence a stronger ECE
(supplementary text, section 3.5).

force of nonpolar and polar phases extracted from the component distribution in
PCA-MCR analysis. (F and G) Normalized chemical maps of TP and TPD irradiated
with a 1288 cm⁻¹ laser collected from the hyPIR results. TPD-1% exhibited a superior
polarity response. (H) Normalized full IR spectrum of the corresponding AFM-IR
chemical map [(F) TP, (G) TPD-1%]; TPD-1% has a distinct broad peak near 1288 cm⁻¹.
(I) Normalized photoinduced force at local positions marked in (F) and (G).


Structural analysis of the TPD polymers 


To assess the role of DMHD crystals before and 
after their evaporation, we conducted several 
structural analyses of TP, TPD-1%, and unan- 
nealed TPD-1% (denoted as TPD-1%-un), in which 
the composite was not annealed in vacuum at 
120°C, rendering a low-crystallinity polymer 
matrix. Differential scanning calorimetry (DSC) 
was conducted, showing that the melting en- 
thalpy of TPD-1% was 24 J/g, compared with 
22 J/g for TP and 19.7 J/g for TPD-1%-un (Fig. 3A 
and fig. S29). The slight enhancement of crys- 
tallinity was further verified by wide-angle x-ray 
diffraction (WAXD) (Fig. 3B), and the crystal- 
linity of TPD-1% was marginally increased from 
29 (TP) to 33%. The enhancement of crystallinity 
in the bulk polymer was minimal compared 
with the strong all-trans conformations that we 
observed on the surface, suggesting that most 
of the emerged polar entities were indeed con- 
fined at the interfaces. From the DSC results, we 
noted that the polar-nonpolar transitions for 

Fig. 3. Structural properties of the modified EC polymers. (A) DSC first-heating
thermograms; the inset shows the integrated enthalpy of crystallization.
(B) WAXD patterns of TP, TPD-1%, and TPD-1%-un without an electric field;
the inset shows the crystal size data. (C) In situ WAXD patterns of TP, TPD-1%,

TP, TPD-1%, and TPD-1%-un were located at 
38°, 30°, and 29°C, respectively. The reduced 
transition temperatures after incorporating 
DMHD indicated that the average crystallite 
size was reduced, which was also confirmed 
with WAXD (Fig. 3B and table S2) and the 
prior AFM results. 

To verify the origin of the colossal ECE (in- 
terfacial or bulk effect), we conducted in situ 
WAXD to investigate the dynamic transition 
of the 3D bulk crystalline structures under electric
fields (Fig. 3C and figs. S30 to S32). The
presence of DMHD crystals in TP (TPD-1%-un) 
stimulated a strong phase transition compared 
with the transition in neat TP. The nature of 
the DMHD-induced transition in the bulk do- 
mains is to some extent interesting because it 
could generate a very large ECE if occurring 
without a large dielectric loss (fig. S30C). We 
constructed a DFT framework to qualitative- 
ly assess the contribution of DMHD to the 
nonpolar-polar transition (12). For simplic- 
ity, we considered only dihedral angles of the 
VDF monomers with and without DMHD, and 
DMHD alone reduced the energy barrier by 
~38% (Fig. 3D). The established interfacial 
hydrogen bonds between the OH groups on 
DMHD and -CF₂ moieties on PVDF could be
responsible for the induced interfacial all-trans 
configuration (fig. S33) (47). 

After the annealing and evaporation of DMHD, 
the transition enhancement partially remained; 
under 100 MV/m, the field-induced polar fraction
of bulk domains increased from 26 (TP)
to 37% (TPD-1%). This indicates that the porous
polymer matrix formed by the once-existing
DMHD interfacial crystals still exhibited a lower
energy barrier for the polar and nonpolar phase
transition on the bulk crystal (Fig. 3E). The
electromechanical transverse strain of the TPD
sample generated 300% (S₁ = 3%) enhancement
compared with that of the reported TP
(S₁ = 1%) (48), owing to the lowered barrier for
phase transition (fig. S34). The mean square 
displacements (MSDs) of TPD-1%-un were the 
smallest among those of TP and TPD-1% over 
the entire temperature range from 200 to 350 K 
(Fig. 3F and fig. S35). The reduced atomic fluc- 
tuation indicated that the hydrogen atoms were 
closely confined adjacent to the DMHD surface,
which could explain the mediocre ECE in
TPD-1%-un (fig. S10D). By sacrificing the DMHD,
TPD-1% exhibited an MSD similar to that of TP,
owing to the release of the interfacial confinement
of β-like conformations with a large surface
area, which substantially contributed to a
colossal ECE in TPD-1%. 


Dielectric analysis and long-term operation 


Our structural analysis suggested that the in- 
terfacial (2D) all-trans conformations are like- 
ly more effective in generating the ECE than 
is bulk (3D) crystallization. Ideally, the ECE 
can be described by Landau phenomenological 

and TPD-1%-un under 100 MV/m. (D) Energetics along the transformation
pathway from the α to β phase obtained with DFT for pure PVDF and DMHD-modified
PVDF. (E) Fraction of polar phase changes in the series of TPD
samples. (F) MSD of TP and unannealed and annealed TPD-1% samples.


theory,

$$\Delta S = -\tfrac{1}{2}\,\beta\left[P^{2}(E_h) - P^{2}(E_l)\right] \qquad (1)$$

where $E_h$ is the applied high electric field and $E_l$
is the low electric field. When $E_l = 0$,
$\Delta S = -\tfrac{1}{2}\beta P^{2}(E_h)$, with $\beta = \ln(\Omega)/(\varepsilon_0 \Theta)$.
Here $\varepsilon_0$ is the permittivity of vacuum, and $\Theta$ is an effective
Curie constant, which is directly related
to the polar correlation in dielectrics. A large
$\Omega$ corresponds to numerous polar entities
accessible by dipoles, and a weak polar correlation,
$\Theta$, will result in a large $\beta$ and a
high-efficiency EC material (12). The dimensionally
reduced 2D polar interfaces should exhibit a
giant $\beta$ coefficient because of their substantially
reduced size and increased surface area compared
with those of the 3D polar domains.
Therefore, the ECE in the interface-augmented
TPD sample could be divided into two parts:

$$\Delta S_{EC} = \Delta S_{EC}^{3D} + \Delta S_{EC}^{2D} \qquad (2)$$

where $\Delta S_{EC}^{3D}$ refers to the EC contribution of
the 3D polar domains, and $\Delta S_{EC}^{2D}$ refers to the
2D polar interfaces.
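
Equations 1 and 2 can be turned into a short numerical sketch. The Python below only illustrates the algebra and is not the simulation used in this work: the function names and every input value (the β coefficients and the polarizations) are hypothetical placeholders, and assigning a separate polarization to the 3D and 2D parts is an assumption made for the example.

```python
def landau_entropy_change(beta, p_high, p_low=0.0):
    """Eq. 1: dS = -1/2 * beta * [P^2(E_h) - P^2(E_l)]."""
    return -0.5 * beta * (p_high ** 2 - p_low ** 2)


def total_entropy_change(beta_3d, p_3d, beta_2d, p_2d):
    """Eq. 2: sum of the 3D-domain and 2D-interface terms (taking E_l = 0)."""
    return (landau_entropy_change(beta_3d, p_3d)
            + landau_entropy_change(beta_2d, p_2d))


# Hypothetical inputs in arbitrary but mutually consistent units.
BETA_3D = 1.0   # bulk (3D) coefficient, placeholder value
BETA_2D = 5.0   # interfacial (2D) coefficient, assumed larger as the text argues
P_3D = 0.05     # polarization attributed to the 3D domains, placeholder
P_2D = 0.02     # polarization attributed to the 2D interfaces, placeholder

ds_3d = landau_entropy_change(BETA_3D, P_3D)
ds_2d = landau_entropy_change(BETA_2D, P_2D)
ds_total = total_entropy_change(BETA_3D, P_3D, BETA_2D, P_2D)
print(ds_3d, ds_2d, ds_total)
```

At fixed polarization the entropy change scales linearly with β, which is why a larger β at the interfaces raises their share of the total.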

We applied the time-dependent Landau- 
Ginzburg-Devonshire thermodynamic model 
assisted by phase-field simulation to simu- 
late our experimental ECE (Fig. 4A and figs. 
S1 and S36) to provide a quantitative under- 
standing. We constructed the distributions of 
the polar domains in TP and TPD-1% by considering
the WAXD and Brunauer-Emmett-Teller
(BET) experimental results (Fig. 4A,
inset). Owing to the large surface area intro- 
duced by the nanopores, the polar interfaces 
contributed roughly 10% of the total polar 
entities (fig. S37). Our model provided EC- 
induced entropy changes of TP and TPD-1% 
that are in good agreement with our exper- 
imental results (Fig. 4A). 

We corroborated the interface-augmented
ECE by analyzing the dielectric properties.
Because TPD-1% is essentially an air-incorporated
nanocomposite, its volumetric polarization and
permittivity are expected to be reduced, as has been shown
in many porous materials used to design low-k
dielectrics (49). Instead, the bipolar polarization-electric
field loops (P-E loops) showed that the polarization
of TPD-1% was enhanced from 0.048 C/m²
for TP to 0.058 C/m², a 20% increase (Fig. 4B
and fig. S38). Because of the subnanoscale air
pores and the polar interfaces with a large polar
surface area, TPD-1% exhibited high permittivity
(Fig. 4C) and low dielectric loss (fig.
S39). The dielectric strength was also increased
from 350 MV/m for TP to 500 MV/m (fig. S46),
which is highly desired in the actual cyclic operation
of EC refrigeration.

The sacrificial DMHD-created neat polymer retains
impressive dielectric properties as a former
nanocomposite but is free from the complications
(losses) introduced by the permanent fillers. We
observed a series of temperature-independent
peaks located at approximately 37°C, which
supports the existence of highly localized polar
entities (Fig. 4C). Removal of the DMHD nanocrystals
exposed these polar interfaces to smaller
free volumes and freed them of physical constraints,
which markedly improved the β coefficient
in TPD-1% (Fig. 4D, inset), which was
2.7-fold higher than that of TP; this trend was
consistent with the thermodynamic relation
for ΔT_sat (50-52) (supplementary text, section
5.5), where k is the Boltzmann constant,
v is the volume, and C_E is the volume specific

Fig. 4. Dielectric properties and long-term operation of TPD-1%. (A) Experimental
and phase-field simulation (PFS) ECE of TP and TPD. The inset shows the spatial
distribution of the polar crystalline phases of TPD-1% at zero and 100 MV/m, which
was generated in the phase-field model. (B) Experimental P-E loops of TP and
TPD-1%. (C) Temperature- and frequency-dependent permittivity and loss of TPD-1%.
(D) The ratio of the β coefficient of nanocomposite-modified PVDF-based polymeric
EC materials to that of their respective pristine TPs. The inset presents the β
coefficient values of TP and TPD-1% in this work. (E) EC-induced entropy
changes as a function of temperature under electric fields of 50, 70, and 100 MV/m.
(F) Comparison of the RC (= ΔS × T_window, T_window ≥ 80% ΔT_max) with that
of other reported bulk ceramics and PVDF-based polymers. (G) Material COP and
thermodynamic perfections of TP and TPD-1%. (H and I) Material impact on the
device performance of TPD-1% tested initially and after 3 million cycles. (H) shows
the field-dependent ECE, and (I) shows the summarization of the crystallinity
and crystal size after every 1 million cycles; the inset of (I) shows the x-ray diffraction
data of TPD-1% before and after 3 million cycles.

heat. By contrast, most EC nanocomposites (NCs)
exhibited a β coefficient similar to that of their
base TP (BT); the β_NC/β_BT ratio was ~1, indicating
that the ECE enhancement was primarily
due to the extrinsically increased polarization
at the interface (Fig. 4D and fig. S40). We noted
that the high-energy electron-irradiated P(VDF-TrFE)
copolymer (at a relatively low dose of
20 Mrads) exhibited a great β coefficient as
well (13), which is attributed to the manipulation
of the polar domain size, crosslinking density,
crystallinity, and the multiconformation
coexistence (supplementary text, section 5.6).
The colossal ECE of TPD-1% is temperature- 
independent near room temperature, covering 
an effective temperature window from 10° to 
70°C, owing to the low dipole correlation en- 
abled by the nanoscale distribution of polar 
entities (Fig. 4E). The temperature stability of 
the ECE is advantageous in EC cooling devices 
that operate under active electrocaloric regen- 
eration to expand the temperature window of 
the device (T_window) without cascading working
materials with narrow ranges of a large ECE 
(53). As a result, the RC of TPD-1% is also large 
compared with previous reports from other materials
(Fig. 4F). To assess the efficiency of
the EC material, we evaluated the material
coefficient of performance (COP_mat) by considering
the irreversible electric energy loss
under the same working condition of ΔT =
T_window = 10 K (detailed circuit design and calculation
procedure provided in figs. S47 and
S48) (54). Compared with the base TP (COP_mat
is 10.9), the DMHD-modified TPD-1% exhibited
excellent refrigeration efficiency of 91%
(COP_mat is 27.4), considering a 100% charge
recovery ratio (Fig. 4G), which has been achieved
in a custom-made charge recovery circuit for
EC technology (55). By operating under an
ultralow electric field, the TPD-1% offered a
greatly improved COP_mat that could further
reduce the size and weight of the power source
for a potential portable EC cooling device.
TPD-1% exhibited impressive long-term stability
over 3 million cycles under 50 MV/m
electric fields (<20% of E_b) and in a laboratory
environment without special conditions. After
3 million continuous cycles, TPD-1% still offered
an entropy increase (cooling) of 77 J/(kg·K)
at 100 MV/m (Fig. 4H). The ECE of TPD-1%
reached its steady state after 1 million cycles
and remained stable for the rest of the fatigue
evaluation. The reduction in EC performance in
the first 1 million cycles could be explained by
field-induced damage that was self-healed but
permanently reduced the crystallinity (Fig. 4I).
After self-healing from the damage at the weak
links, ECE was stabilized in the subsequent
cycles (fig. S49). TPD-1% presents the largest
EC-induced entropy change over the longest


Zheng et al., Science 382, 1020-1026 (2023) 


cycling lifetime (>70 days) by far (12, 56), making
it a good candidate for practical EC devices. 
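
The efficiency figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the quoted refrigeration efficiency is the ratio of COP_mat to the Carnot COP, T_c/ΔT. This is a common definition, but it is not spelled out in this section. The function names are illustrative, and the derived cold-side temperature and TP efficiency follow from that assumption rather than being reported values.

```python
def carnot_cop(t_cold_k, delta_t_k):
    """Ideal (Carnot) cooling COP for a cold side at t_cold_k and a span delta_t_k."""
    return t_cold_k / delta_t_k


def thermodynamic_perfection(cop, t_cold_k, delta_t_k):
    """Fraction of the Carnot limit reached by a given COP (assumed definition)."""
    return cop / carnot_cop(t_cold_k, delta_t_k)


def implied_cold_temperature(cop, perfection, delta_t_k):
    """Cold-side temperature implied by a COP and its fraction of the Carnot limit."""
    return cop / perfection * delta_t_k


DELTA_T_K = 10.0     # working condition quoted in the text
COP_TPD = 27.4       # COP_mat of TPD-1% from the text
COP_TP = 10.9        # COP_mat of base TP from the text
PERFECTION_TPD = 0.91

# Back out the cold-side temperature from the TPD-1% numbers, then reuse it for TP.
t_cold_k = implied_cold_temperature(COP_TPD, PERFECTION_TPD, DELTA_T_K)
perfection_tp = thermodynamic_perfection(COP_TP, t_cold_k, DELTA_T_K)
print(round(t_cold_k, 1), round(perfection_tp, 3))
```

Under this assumption, the reported pair (27.4, 91%) implies a cold-side temperature of about 301 K, and the base TP's COP_mat of 10.9 corresponds to roughly 36% of the Carnot limit.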

By incorporating extrinsic, sacrificial organic 
crystals, we successfully created internal polar 
interfaces embedded in EC polymers, which 
exhibit a colossal ECE with a long cycling life- 
time. Using multiple experimental and theoret- 
ical tools, we directly observed the augmentation 
of the all-trans conformations that formed a 
highly disordered structure and demonstrated 
the key role that these interfaces played in induc- 
ing the colossal ECE. The interface-augmented 
polar entities reached the characteristic size at 
which 3D bulk crystals can hardly be achieved 
during conventional polymeric crystallization. 
By having a large surface area and giant con- 
figuration entropy, the polar interfaces con- 
tribute more efficiently to the total EC entropy 
change than does the bulk crystal. Dimensionally 
reducing 10% of the total crystalline phase to 
2D stimulated the EC enhancement. Exploring 
the 2D polar structures in a dielectric may 
prove fruitful in enhancing the ECE in ferro- 
electric polymers and may pave the way to a 
transition of EC research similar to that from 
dielectric capacitors to supercapacitors. 
SLEEP


Nesting chinstrap penguins accrue large quantities 
of sleep through seconds-long microsleeps 


P.-A. Libourel, W. Y. Lee, I. Achin, H. Chung, J. Kim, B. Massot, N. C. Rattenborg


Microsleeps, the seconds-long interruptions of wakefulness by eye closure and sleep-related brain
activity, are dangerous when driving and might be too short to provide the restorative functions of sleep.
If microsleeps do fulfill sleep functions, then animals faced with a continuous need for vigilance
might resort to this sleep strategy. We investigated electroencephalographically defined sleep in wild
chinstrap penguins, at sea and while nesting in Antarctica, constantly exposed to an egg predator
and aggression from other penguins. The penguins nodded off >10,000 times per day, engaging in bouts
of bihemispheric and unihemispheric slow-wave sleep lasting on average only 4 seconds, but resulting
in the accumulation of >11 hours of sleep for each hemisphere. The investment in microsleeps by
successfully breeding penguins suggests that the benefits of sleep can accrue incrementally.


An animal’s ability to engage adaptively
with the environment during wakeful- 
ness depends on sleep, a state of envi- 
ronmental disengagement thought to 
perform restorative functions for the 
brain (1, 2). As the time spent awake increases,
so does the homeostatically regulated pres- 
sure to fall asleep (3). In humans, insufficient 
sleep, common in our 24/7 societies, leads to 
nodding off, the seconds-long interruption of 
wakefulness by eye closure, sleep-related electro- 
encephalogram (EEG) activity (4), and deac- 
tivation of brain networks involved in arousal 
(5). Such microsleeps can be maladaptive, es- 
pecially when nodding off occurs while driving 
a motor vehicle (6). Even when microsleeps 
pose no threat, it is unclear whether they are 
long enough to provide any of the benefits of 
sleep (7). If microsleeps are more than failed 
attempts to initiate sleep and do fulfill sleep 
functions, then relying on microsleeps might 
be an adaptive strategy under ecological cir- 
cumstances that require constant vigilance. 
The reduction in environmental awareness 
that defines sleep renders animals vulnerable 
to predation. Although animals can dilute this 
risk by sleeping in groups (8), the benefit is 
greatest for those in the center, farthest from 
approaching predators (9). Indeed, mallards 
(Anas platyrhynchos) switch from sleeping with 
both eyes closed and both cerebral hemispheres 
[bihemispheric slow-wave sleep (BSWS)] when 
safely flanked by other birds to sleeping uni- 
hemispherically, with one eye open and the 
contralateral hemisphere awake, when exposed 
at the edge of a group (10). As sleeping at the
edge is risky and results in lower quality unihemispheric
slow-wave sleep (USWS), birds
likely compete to obtain and defend a central 
position within the group, especially when nest- 
ing in colonies. However, in colonial birds, such 
as penguins, intraspecific aggression from 
neighbors and disturbance from birds walk- 
ing through the colony might have a negative 
impact on sleep (11, 12). Given the threat from 
outside and the hustle and bustle within the 
colony, it is unclear whether nesting in the cen- 
ter of a colony leads to better sleep quantity and 
quality. 

We investigated sleep in chinstrap penguins 
(Pygoscelis antarcticus) nesting in a colony 
exposed to a predatory bird, the brown skua 
(Stercorarius antarcticus), on King George 
Island, Antarctica. During incubation, skuas 
are known to prey on penguin eggs mainly on 
the border of the colony (13). As one penguin
parent must therefore guard the eggs or small 
chicks continuously while its partner is away 
on foraging trips lasting several days, they face 
the challenge of needing to sleep while pro- 
tecting their offspring (movie S1). In addition, 
they also have to effectively defend their nest 
site from intruding penguins. 

We examined sleep in 14 penguins incubating 
eggs in a colony during early December 2019 
(data S1). Using data loggers (Fig. 1A), we mea- 
sured sleep-related EEG activity from both 
cerebral hemispheres (Fig. 1B), the electro- 
myogram (EMG) from the neck muscles, body 
movements and posture with accelerometry, 
location with GPS (Fig. 1C), and diving with a 
pressure sensor. BSWS and USWS were auto- 
matically scored using hierarchical clustering 
and individual thresholding on the EEG (Fig. 
1, D to G, and data S2 and S3). In a subset of 
birds, video recordings were obtained at the 
nest to record sleep behavior. Bouts of rapid 
eye movement (REM) sleep, characterized by 
wake-like EEG activity, eye closure, and drop- 
ping of the head were observed in the video 
recordings (movie S2). However, as in other 
birds, including penguins (14), the EMG did 
not reliably decrease during REM sleep (movie
S2), and REM sleep could only be distinguished
from wakefulness when a bird was in the field 
of view of the camera. Consequently, we focused 
on slow-wave sleep (SWS), the predominant 
type of sleep in birds (15), including penguins 
(14, 16, 17). Accelerometry was used to identify 
when the birds were lying or standing (Fig. 1H) 
and to identify periods of activity or inactivity 
(Figs. 1I and 2A and data S4).
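
The scoring step described above can be pictured with a minimal sketch. It is not the authors' pipeline: it assumes 1-s epochs, a single per-bird power threshold already derived from the clustering, and that higher 2- to 12-Hz EEG power indicates SWS. The function names, state labels, and toy power values are all hypothetical.

```python
from itertools import groupby


def score_epoch(left_power, right_power, threshold):
    """Label one epoch from the EEG power of each hemisphere."""
    left_sws = left_power > threshold
    right_sws = right_power > threshold
    if left_sws and right_sws:
        return "BSWS"      # both hemispheres show slow-wave activity
    if left_sws:
        return "USWS_L"    # only the left hemisphere sleeps
    if right_sws:
        return "USWS_R"    # only the right hemisphere sleeps
    return "QW/REM"        # quiet wake or REM sleep (not separable from EEG alone)


def sws_bout_durations(states, epoch_s=1.0):
    """Merge consecutive SWS epochs of any type into bouts; return durations in s."""
    durations = []
    for is_sws, run in groupby(states, key=lambda s: s != "QW/REM"):
        if is_sws:
            durations.append(len(list(run)) * epoch_s)
    return durations


# Toy per-epoch power values (arbitrary units) and a hypothetical threshold.
left = [0.2, 1.5, 1.7, 0.3, 1.6, 0.2]
right = [0.1, 1.4, 0.4, 0.2, 0.3, 1.9]
states = [score_epoch(l, r, threshold=1.0) for l, r in zip(left, right)]
bouts = sws_bout_durations(states)
print(states, bouts)
```

Treating adjacent BSWS and USWS epochs as one bout is a choice made here for illustration; counting each type separately would give shorter bouts.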

The penguins exhibited normal nesting behavior,
with the parents taking turns incubating
for 22.06 ± 14.72 hours (range: 5.52 to
64.3 hours) and foraging at sea (Fig. 2A and
see the materials and methods in the supplementary
materials). Penguins undertook one
to nine foraging trips, on average reaching a
distance of 34.01 ± 39.66 km (range: 6.26 to
129.68 km) from the colony and lasting 9.43 ±


Fig. 1. Recording behavioral states in wild chinstrap penguins.
(A) Penguin equipped with sleep and GPS loggers mounted on the back and a pressure
logger mounted on a leg (not shown). (B) Computed tomography images showing the
position of EEG (black dots) and reference (red dots) electrodes. The top image shows
the surface of the skull, and the bottom image shows the underlying endocast of the
brain (pink), revealing a bulge (Wulst) on each hemisphere corresponding to the
hyperpallium (H; primary visual cortex) (21) outlined by a black oval for the right
hemisphere. Each hyperpallial electrode was referenced to the cerebellum (Cb) electrodes.
(C) GPS tracks showing foraging trips from the colony (red arrow) on King George Island.
(D) EEG example showing typical amplitude and frequency ranges.
(E) Dendrogram (left) and correlation map (center) obtained from the hierarchical
clustering computed on 3600 1-s EEG epochs randomly chosen during inactivity over
10 days for one hemisphere. On the right are the median raw power spectral density
(PSD, top) and PSD normalized (nPSD) by the median PSD (bottom) of the two clusters,
representing the two main EEG patterns occurring during inactivity
[quiet wake (QW)/REM, cluster 1, red; SWS, cluster 2, blue].
(F) Density for the two 2- to 12-Hz EEG power clusters identified in (E) showing the
intersection (black line) used to separate QW/REM (red) from SWS (blue).
(G) EEG recordings from both hemispheres showing the resulting automatic coding of
SWS (blue) and QW/REM (red), along with the EMG and three axes (XYZ) of accelerometry
recorded by the back-mounted sleep logger. (H) While on land, the bimodal distribution
of pitch angles measured with the accelerometer was used to determine when the birds
were standing (“A”) or lying (“B”), separated by the green dashed line. (I) Mean
(line) and standard deviation (shadow) activity [norm of the acceleration in (G)] of
birds on land (red) and at sea (blue). The red dashed line indicates the threshold
used to separate periods of activity from periods of inactivity. [Penguin photos
in (A) and (H) by P.-A. Libourel]

Fig. 2. Waking and microsleep behaviors in chinstrap penguins. (A) Three
summary plots of all the variables measured in one penguin across 10 days.
(Top) Diving depth (blue) measured with a pressure sensor. (Middle) Analyzed
periods when the penguin was incubating (red bar; red dots denote visual
observations confirming incubation) or was not incubating (orange bar), as
determined by the penguin’s position (lying or standing near nest, respectively);
times at sea (blue bar); and a recovery period examined after returning from
sea (purple bar). The black Xs show when the nest was visually checked,
but the bird was not seen at the nest, the black Vs show when a video of the
bird was obtained, and the blue square shows when the bird was seen on the
shore. Distance from the nest is coded at the bottom with a blue (near) to yellow
(far) scale. (Bottom) Hypnogram showing time spent in the following states:
active wake (dark red), quiet wake or REM sleep (lighter red), BSWS (blue),
USWS in the left (purple) or right (green) hemisphere, resting at sea (gray), and
undetermined (white). (B) Recordings showing typical EEG activity during
BSWS (blue), USWS left (purple), USWS right (green), and the corresponding


8.11 hours (range: 3.03 to 42.53 hours) (data
S1). During these trips, the penguins frequently
dove to a depth of 52.75 ± 19.72 m, primarily
during the day. After a foraging trip, the
birds spent 11.43 ± 8.7 hours (range: 1.05 to
44.13 hours) on land near the coast or standing
near the nest before returning to the nest
to switch with their partners.

Direct observations and accelerometry (Fig. 
2A and data S4) reveal that the birds remained 
lying (incubating) for hours while at the nest. 
Sleep occurred while standing or lying. As 
previously reported in several bird species (10), 
video recordings revealed an association be- 
tween SWS in a given hemisphere and closure 
of the contralateral eye (Fig. 2B and movie S3). 
Chinstrap penguins engaged in extremely short
bouts of BSWS (2.26 ± 0.41 s) and USWS (1.13 ±
0.07 s for the right hemisphere and 1.09 ± 0.05 s
for the left hemisphere) (data S4 to S7). Even
when consecutive bouts of different types of SWS
(BSWS or USWS) were counted as a single bout
of SWS, the bouts lasted only 3.91 ± 0.83 s (Fig.
2C). The maximum SWS bout duration was
34.22 ± 11.73 s (data S6 and S7); however, the
majority (71.94%) of SWS was obtained through
bouts lasting <10 s (Fig. 2C). Despite engaging
primarily in short bouts of SWS, the penguins
accumulated 14.92 ± 1.13 hours of SWS per day
(data S5): 8.55 ± 1.42 hours of BSWS, and 2.98 ±
0.56 hours and 3.38 ± 0.68 hours of USWS for
the left and right hemispheres, respectively.
Consequently, each hemisphere obtained between
11.5 and 12 hours of SWS per day.
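
The per-hemisphere totals follow directly from the reported means, as the short check below shows. It uses only the values quoted in this paragraph. The variable names are illustrative, and the bouts-per-day figure is a rough estimate obtained by dividing mean daily SWS time by mean merged-bout duration, not a reported statistic.

```python
# Mean daily values quoted in the text (hours, except bout duration in seconds).
BSWS_H = 8.55
USWS_LEFT_H = 2.98
USWS_RIGHT_H = 3.38
TOTAL_SWS_H = 14.92
MEAN_MERGED_BOUT_S = 3.91

# BSWS counts toward both hemispheres; USWS counts toward one hemisphere only.
summed_sws_h = BSWS_H + USWS_LEFT_H + USWS_RIGHT_H
left_hemisphere_h = BSWS_H + USWS_LEFT_H
right_hemisphere_h = BSWS_H + USWS_RIGHT_H

# Rough bout count implied by the mean total and the mean merged-bout duration.
approx_bouts_per_day = TOTAL_SWS_H * 3600 / MEAN_MERGED_BOUT_S

print(round(summed_sws_h, 2), round(left_hemisphere_h, 2),
      round(right_hemisphere_h, 2), round(approx_bouts_per_day))
```

The components sum to 14.91 hours, matching the reported 14.92 hours within rounding, and each hemisphere receives about 11.5 to 11.9 hours. The implied bout count of roughly 13,700 per day is consistent with the >10,000 microsleeps per day stated in the abstract.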

Although incubating and nonincubating 
penguins engaged in >600 bouts of SWS per 
hour (621.32 ± 97.2 bouts) (Fig. 2D and data S8),
incubating birds exhibited more SWS, consisting 
of more, but shorter, bouts of BSWS and more 
time in, and bouts of, USWS (Fig. 2E and data 
S8), a possible response to caring for the eggs. 
Nonetheless, both incubating and nonincubat- 
ing birds slept better during the middle third 
of the day (hours 8 to 16) (Fig. 2D and data 
S8 and S9). Notably, sleep fragmentation and 
asymmetry were lower during the middle third 
(data S8 and S9), and SWS intensity (sleep 
depth) was higher for both hemispheres dur- 
ing BSWS and for the sleeping hemisphere 
during left or right USWS (data S8 and S9). 
The increased intensity and overall quality of 
sleep during the day might be mediated by 
state of the eyes (right eye closed, purple bar; left eye closed, green bar).
Bouts of continuous SWS (BSWS and USWS) are marked by the dashed lines.
(C) Density of SWS bout durations and the cumulative percentage of SWS
attributable to bouts of SWS with increasing durations computed from all
bouts of SWS in all incubating birds pooled together (11 birds; 503,610 SWS
bouts). (D) Percent SWS and the number of SWS bouts for each hour of
the day and night for all birds on land (N = 11). (E) (Left to right) Percentage
of SWS, number of BSWS bouts, BSWS bout durations, percentage of USWS,
and number of USWS bouts between incubating (red) and nonincubating
(orange) birds (N = 11).
Fig. 3. The relationship between sleep and nest position within the colony. (A) Example of nests located in
the center (purple dots, N = 7) and at the border (blue dots, N = 4) of the colony. Scale bar, 20 m. (B to I) Comparison
of various sleep parameters between incubating birds nesting in the center or at the border of the colony.


potential diel variation in the activity of skuas 
(18). However, in this case, one would expect the 
changes in sleep to be greater in birds incubat- 
ing and defending their eggs. Instead, environ- 
mental factors experienced by all birds likely 
influence the regulation of sleep intensity. 

To further investigate the potential impact 
that predation pressure has on sleep in chinstrap 
penguins, we compared sleep in birds nesting 
in the center of the colony with that of birds 
nesting more exposed to skuas at the colony 
border. Nests were classified as either on the 
border or in the center (>2 m away from the 
border) of the colony (Fig. 3A and fig. S1). Birds
nesting at the border obtained more SWS (Fig. 
3B), through fewer (Fig. 3C) but longer bouts 
of SWS (Fig. 3D) than those in the center of the 
colony. The maximum duration of SWS was 
also longer in birds incubating at the colony 
border (Fig. 3E), and birds at the border slept 
more deeply than those in the center (Fig. 3F). 
Also, none of the variables for USWS (percent 
of time, number of bouts, and bout duration; 
Fig. 3, G to I) varied significantly as a function 
of nest position (fig. S2). Consequently, contrary 
to predictions based on mallards (10), birds
Fig. 4. Sleep during and after trips to sea. (A) Example of BSWS (dashed
boxes) in a penguin floating on the surface of the sea. The accelerometry signals 
(XYZ) occurring during sleep likely reflect movement of the water surface. The bout 
of sleep is interrupted by a brief awakening associated with faster movements 


nesting at the colony border slept better (more, 
deeper, and less fragmented sleep) than those 
nesting in the center, and they did not engage 
in more USWS. 

We next examined the behavior of penguins 
at sea (movie S4), focusing on trips of >5 hours. 
Our accelerometry and depth data revealed 
quiet periods at the water surface when the 
only movement was apparently related to waves. 
Although movement artifacts often obscured 
the EEG, in three birds, we were able to iden- 
tify bouts of BSWS (Fig. 4A) and possibly REM 
sleep, but not clear bouts of USWS. The iden- 
tification of REM sleep was based on wake-like 
EEG activity, reduced muscle tone, and rotation 
along the lateral axis occurring immediately 
after bouts of SWS (fig. S3). As with bouts of 
REM sleep confirmed with video on land, the 
episodes at sea lasted only a few seconds. Our 
accelerometry data show that, despite being 
able to engage in BSWS and (possibly) REM 
sleep at sea, the penguins spent more time 
actively awake (e.g., diving) while at sea than 
on land (67.87 ± 2.04% versus 2.29 ± 3.38%) 
(data S4). Finally, the time spent in SWS dur- 
ing the first two hours after returning from sea 
correlated with the time spent at sea [coefficient 
of determination (R²) = 0.2372, t-statistic 
P = 0.0404; Fig. 4B], indicating that the pen- 
guins needed to recover some of the sleep lost 
at sea when the duration at sea was >20 hours. 
Longer recovery periods (4 hours) did not re- 
veal a significant rebound, suggesting that re- 
covery mostly occurs during the first hours after 
returning from sea. 

Although bouts of avian SWS are known to 
be short compared with those in mammals 
(15), the acquisition of SWS primarily through 
thousands of microsleeps lasting only 4 s is unprecedented, 
even among penguins (14, 16, 17). 
Captive, nonbreeding emperor penguins (Apte- 
nodytes forsteri) also exhibit periods with fre- 
quent alternation between waking and SWS 
EEG patterns, called “drowsiness” by Buchet 
and colleagues (16), that resemble the micro- 
sleeps observed in chinstrap penguins; however, 
these emperor penguins were found to spend 
only 14% of the time in a state of drowsiness, 
and they additionally spent 37.5% of the time 
in SWS characterized by bouts of continuous 
EEG slow waves lasting >4 min that occurred 
only 125 times per day (16). Similarly, little pen- 
guins (Eudyptula minor) recorded in small 
metabolic chambers exhibit bursts of slow waves 
during a state called “quiet wakefulness” by 
Stahel and colleagues (14) that resemble microsleeps 
in chinstrap penguins, but bouts of 
SWS lasted 42 s on average. The difference be- 
tween sleep continuity in these penguins and 
chinstrap penguins is likely related to the re- 
cording context (captive and alone as opposed 
to wild in a colony) and the birds’ reproduc- 
tive state. 

Sleep in breeding chinstrap penguins was 
highly fragmented under all conditions and 
positions on land. This might reflect an overall 
state of antipredatory vigilance linked to being 
in a reproductive physiological state. Although 
predatory skuas usually target nests at the 
borders of penguin colonies (13), we cannot rule 
out the possibility that they sometimes also 
landed in gaps within the colony. Nonetheless, 
the predation pressure was still likely higher 
at the edge. Consequently, based solely on the 
findings in mallards (10), one would expect 
penguins at the border of the colony to sleep 
more poorly than those in the center. However, 
our findings suggest that the penguins’ already- 
disrupted sleep patterns are disturbed even 
more by intraspecific interactions within the 
colony (11, 12). This is likely driven by intraspecific 
aggression (11) and associated stress (19), 
as well as noise in the center of the colony. 

Although we did not directly measure the 
restorative value of microsleeps, the chinstrap 
penguins’ large investment in microsleeps, 
evident in the accelerometry. (B) The relationship between time at sea and the time 
in SWS during the first two hours after returning to land. As penguins returned 
at various times of the day, the quantity of sleep was normalized relative to that 
occurring during the same time on a day not immediately preceded by a trip to sea. 


characterized by potentially costly momen- 
tary lapses in visual vigilance (eye closure), and 
their ability to successfully breed, despite sleep- 
ing in this highly fragmented manner, suggest 
that microsleeps can fulfill at least some of the 
restorative functions of sleep. The momentary 
neuronal silence that gives rise to each slow 
wave might provide windows for neuronal rest 
and recovery (2, 20), the benefits of which could 
accumulate irrespective of the duration of SWS 
bouts. Accordingly, this may give animals the 
flexibility to partition sleep into short or long 
bouts, depending on their ecological demands 
for vigilance. 
Theories of planet formation predict that low-mass stars should rarely host exoplanets with masses 
exceeding that of Neptune. We used radial velocity observations to detect a Neptune-mass exoplanet 
orbiting LHS 3154, a star that is nine times less massive than the Sun. The exoplanet’s orbital 
period is 3.7 days, and its minimum mass is 13.2 Earth masses. We used simulations to show that 
the high planet-to-star mass ratio (>3.5 × 10⁻⁴) is not an expected outcome of either the core-accretion 
or gravitational instability theories of planet formation. In the core-accretion simulations, we show 
that close-in Neptune-mass planets are only formed if the dust mass of the protoplanetary disk is an 
order of magnitude greater than typically observed around very low-mass stars. 


Low-mass red dwarf stars (those with 
spectra classified as M dwarfs) are the 
most common stars close to the Sun and 
throughout the Milky Way Galaxy (1, 2). 
Gas giant planets are much rarer around 
M dwarfs than the more massive F, G, and K 
dwarf stars (3), and the vast majority of plan- 
ets orbiting M dwarfs are less massive than 
Neptune (4, 5). Few planets have been detected 
orbiting the least massive (<0.25 solar masses) 
and coolest M dwarfs, known as very low-mass 
dwarfs. This is because very low-mass dwarfs 
are faint and emit most of their radiation at 
infrared wavelengths, at which exoplanet de- 
tection techniques are less sensitive than at 
optical wavelengths. 

The planetary systems around very low-mass 
dwarfs TRAPPIST-1 (6) and Teegarden’s star 
(7) both contain compact systems of small 
(probably rocky) planets. The formation of 
such systems is compatible with the core-accretion 
theory of planet formation (8–11), 
within which the outcome depends strong- 
ly on the total mass of small solid particles 
(dust) within the protoplanetary disk from 
which the planets formed (9). Observations 


of dust disk masses of protoplanetary disks 
have shown that typical disk dust masses 
are lower than required to explain observed 
planetary systems around other stars (12, 13). 
Protoplanetary disk dust masses are observed 
to scale with stellar mass (14, 15), implying that 
the disks around very low-mass stars might 
have dust masses sufficient to form Earth-mass 
planets but not giant planets. However, the 
uncertainties in theoretical models and the 
large dispersion in observed dust masses are 
consistent with a small fraction of low-mass 
stars hosting close-orbiting planets with a 
mass of ≥10 Earth masses (M⊕). 

Massive planet candidates have been de- 
tected around a few very low-mass dwarfs, but 
in all cases, the planets have very wide orbits. 
Examples include GJ 3512 b [mass, >0.46 Jupiter 
masses (Mjup); orbital period of 203 days] (16) 
and TZ Ari b (mass, >0.21 Mjup; orbital period 
of 771 days) (17). These gas giants were inter- 
preted as having formed through a mechanism 
other than core accretion, such as gravitational 
instability within a massive gaseous outer disk, 
which produces more wide-orbiting planets 
than close-orbiting planets (18). Giant planets 


A planet orbiting LHS 3154 


We observed the low-mass M dwarf LHS 3154 
(coordinates are provided in Table 1), located 
15.7531 ± 0.0084 pc from the Sun. We used the 
Habitable-zone Planet Finder (HPF) (19, 20), a 
near-infrared spectrograph (resolving power 
R = 55,000) on the 10-m Hobby-Eberly Telescope 
(HET) at McDonald Observatory in 
Texas, USA (21, 22). The observations were 
taken as part of a survey designed to search 
for planets around very low-mass dwarfs (19). 
We obtained 137 spectra between 23 January 
2020 and 13 April 2022, from which we mea- 
sured the radial velocity (RV) (Fig. 1A) (23). A 
periodogram of the RV data (Fig. 1C) indicates 
a periodic Doppler shift, which we interpret as 
being due to a planet with an orbital period of 
3.7 days; the corresponding false-alarm prob- 
ability is <0.1%. The RV residuals (Fig. 1, B and 
E) contain no evidence of another planet in the 
system. The only other peak in the periodogram 
with <0.1% false alarm probability is a 1-day alias 
of the 3.7-day period (Fig. 1C). 
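
A quick arithmetic check of where sampling aliases of the 3.7-day signal would fall can be sketched as follows. This is a minimal illustration, assuming the standard relation that a roughly nightly observing cadence (sampling frequency of 1 cycle per day) mirrors a true frequency f to |f ± 1/day|. The function name `sampling_aliases` is hypothetical, and the rounded 3.7-day period from the text is used rather than the fitted value in Table 1.

```python
def sampling_aliases(period_days, cadence_days=1.0):
    """Return the two first-order alias periods (in days) of a signal.

    Assumes evenly repeating sampling with the given cadence, so that a
    true frequency f appears mirrored at |f - f_s| and f + f_s, where
    f_s = 1 / cadence_days.
    """
    f_true = 1.0 / period_days
    f_samp = 1.0 / cadence_days
    f_minus = abs(f_true - f_samp)  # lower-frequency alias
    f_plus = f_true + f_samp        # higher-frequency alias
    return 1.0 / f_minus, 1.0 / f_plus


# Rounded orbital period quoted in the text (assumption: 3.7 d exactly).
alias_long, alias_short = sampling_aliases(3.7)
print(f"alias periods: {alias_long:.2f} d and {alias_short:.2f} d")
```

Under these assumptions the two first-order aliases fall near 1.37 and 0.79 days. The text does not state which of them corresponds to the alias marked in Fig. 1C.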

We fitted a one-planet orbital model to the 
RV data (23). The measured properties of LHS 
3154 and our inferred properties of the orbit- 
ing planet, LHS 3154 b, are summarized in 
Table 1. We estimate the stellar mass and radius 
as 0.1118 ± 0.0027 solar masses (M☉) and 
0.1405 ± 0.0038 solar radii (R☉), respectively, 
on the basis of scaling relationships for M 
dwarfs (24, 25). All uncertainties are 1σ unless 
stated otherwise. We determined the stellar 
effective temperature (Teff) as 2861 ± 77 K 
using the HPF-SpecMatch (26) software, which 
compares a given HPF spectrum with a library 
of other spectra with known properties. The 
metallicity of the star (its proportion of elements 
heavier than helium), which is difficult 
to constrain for very low-mass stars, is consistent 
with the Sun’s metallicity (23). The low eccentricity 
of LHS 3154b, e = 0.076 (Table 1), is consistent 
with a circular orbit at 95% confidence. 


Excluding nonplanetary explanations 


Stellar activity, the intrinsic variations of a 
star’s atmosphere, can produce apparent radial-velocity 
shifts that can be misidentified as a 
planetary signal (27). We used several metrics 
to assess the stellar activity of LHS 3154— 
including the differential line width indicator 
and the chromatic index indicators (7)—and 
found no correlation between the RV signal 
and activity indicators in the HPF spectra (23). 
We obtained an additional optical spectrum of 
LHS 3154 using the Low-Resolution Spectro- 
graph (LRS2) (28) on the HET (resolving power 
R = 2500), which showed no evidence of emission 
in the Hα line. Previous work (29) has 
shown that very low-mass M dwarfs without 
detectable Hα emission rotate slowly; we used 
their scaling relationship to estimate the rotation 
period of LHS 3154 as 114 ± 22 days. 
Such slow rotation is consistent with the nar- 
row stellar lines in the HPF spectra. Time-series 
photometry from the Transiting Exoplanet 
Survey Satellite (30) shows no evidence of 
stellar rotation-induced photometric varia- 
bility at short periods (<10 days), nor of flar- 
ing. Time-series photometry over a longer time 
span from the Zwicky Transient Facility (ZTF) 
(31) shows no evidence for variability on time 
scales near the 3.7-day period of the planetary 
signal but does show evidence for variations on 
time scales between 90 and 140 days (23), which 
is consistent with our estimated rotation rate. 
We therefore conclude that LHS 3154 is a 
slowly rotating inactive star and that the 3.7-day 
RV signal is not caused by stellar activity. 
The model fitted to the RVs indicates that 
LHS 3154b has a minimum mass of m sin i = 
13.15 M⊕ (Table 1), where m is the planet mass 
and i is the orbital inclination (which is unknown). 
To derive an upper limit on the planet’s 
mass, we used astrometric information from 
the Gaia spacecraft. A sufficiently massive com- 
panion would induce detectable astrometric 
motion, which would appear as excess astro- 
metric noise in the Gaia data (32). For LHS 3154, 
the Gaia Data Release 3 (33) catalog reported 
an excess astrometric noise of 316 micro-arc sec 
with a significance of 27.7σ. However, the data 
release does not specify the time scale of the 
astrometric variability, making it impossible to 
determine whether it is from the 3.7-day planet 
or an additional long-period companion. Gaia 
astrometry for very red stars is known to be 
affected by larger systematic errors than for 
Sun-like stars (34, 35), so the significance of 
the astrometric excess noise is likely overesti- 
mated. The Gaia renormalized unit-weight error 
(RUWE) parameter accounts for the color- 
dependent systematic issues and has been 
shown to be a better indicator of a compan- 
ion object than is the astrometric excess (34). 
LHS 3154’s RUWE value of 1.12 (33) is con- 
sistent with a single star. Even so, if we assume 
that all of the observed excess astrometric 
noise is due to a planet on a 3.7-day orbit, and 
assuming a single-star astrometric solution 
(36), we obtain a 3σ mass limit of <32Mjup. 
Table 1. Properties of the star LHS 3154 and planet LHS 3154b. The stellar parameters were derived 
by using the HPF-SpecMatch code applied to the HPF spectra (23) and scaling relations for M dwarfs 
(24, 25). The Hα emission is given as the logarithmic ratio of Hα luminosity (L_Hα) to the overall bolometric 
luminosity (L_bol). The planet parameters were derived from an orbital model fitted to the HPF RVs (23). 
Median values are listed, and uncertainties denote the 68% credible intervals. Right ascension and declination 
coordinates are on the International Celestial Reference System (ICRS) at epoch 2016.0. BJD_TDB is the 
barycentric Julian date. The rotation period is estimated from a scaling relationship for inactive M stars (29). 
This rules out a stellar binary companion as 
the origin of the RV variations. Only inclina- 
tions <0.2° would lead to a mass above 13Mjup, 
the approximate minimum mass for deuterium 
fusion, which is sometimes used to distinguish 
giant planets from brown dwarfs. Although 
such an orientation would occur in only ~10⁻⁵ 
of randomly oriented orbital planes, RV sur- 
veys are known to detect such face-on systems 
and misidentify them as planet candidates 
(37). Given the known low occurrence rate of 
brown dwarfs on short-period orbits (38), and 
the astrometric constraint of <32Mjup, we favor 
the interpretation that LHS 3154 b is a plane- 
tary mass object. 
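
The geometric argument can be checked with a few lines of arithmetic. The sketch below assumes isotropically oriented orbits, for which the fraction with inclination below i0 is 1 − cos(i0), and a Jupiter-to-Earth mass conversion of 317.8 (an assumed constant, not taken from the source). All function and variable names are hypothetical.

```python
import math

M_JUP_IN_M_EARTH = 317.8  # assumed conversion factor, not from the source


def true_mass_earth(msini_earth, inclination_deg):
    """True mass (Earth masses) implied by a minimum mass and an inclination."""
    return msini_earth / math.sin(math.radians(inclination_deg))


def inclination_for_mass_deg(msini_earth, mass_earth):
    """Inclination (degrees) at which the true mass reaches mass_earth."""
    return math.degrees(math.asin(msini_earth / mass_earth))


def fraction_below_inclination(inclination_deg):
    """Fraction of isotropically oriented orbits with i < inclination_deg."""
    return 1.0 - math.cos(math.radians(inclination_deg))


msini = 13.15  # minimum mass in Earth masses, as quoted in the text
i_limit_deg = inclination_for_mass_deg(msini, 13 * M_JUP_IN_M_EARTH)
p_face_on = fraction_below_inclination(0.2)  # threshold quoted in the text

print(f"inclination giving 13 Mjup: {i_limit_deg:.3f} deg")
print(f"fraction of orbits with i < 0.2 deg: {p_face_on:.1e}")
```

For a minimum mass of 13.15 M⊕, the true mass exceeds 13 Mjup only for inclinations below about 0.18°, and under the isotropic assumption fewer than one in 100,000 randomly oriented orbits would be that close to face-on.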


Comparison with planet formation models 


In Fig. 2, we compare the planet-to-star mass 
ratio of LHS 3154b with planets around other 
very low-mass M dwarfs, restricted to those with 
masses known to within 30%. We used LHS 
3154b’s mass ratio of 3.5 × 10⁻⁴ to test theories 
of planet formation around low-mass stars. 
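
The mass ratio follows directly from the stellar mass and the planet's minimum mass. A minimal sketch, assuming a Sun-to-Earth mass ratio of 332,946 (an assumed constant, not taken from the source) and using the median values quoted in the text; the function name `planet_to_star_mass_ratio` is hypothetical:

```python
M_SUN_IN_M_EARTH = 332_946.0  # assumed Sun-to-Earth mass ratio


def planet_to_star_mass_ratio(planet_mass_earth, star_mass_sun):
    """Dimensionless planet-to-star mass ratio."""
    return planet_mass_earth / (star_mass_sun * M_SUN_IN_M_EARTH)


# Median values quoted in the text: m sin i = 13.15 Earth masses,
# stellar mass = 0.1118 solar masses. Because m sin i is a minimum
# mass, the result is a lower limit on the true ratio.
q = planet_to_star_mass_ratio(13.15, 0.1118)
print(f"minimum planet-to-star mass ratio: {q:.2e}")
```

With these inputs the ratio comes out at about 3.5 × 10⁻⁴, which is a lower limit because only the minimum mass is known.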
In the core-accretion model of planet formation, 
planets grow from initial overdensities 
(known as cores) in a protoplanetary disk, which 
accrete dust and gas from the surrounding disk. 
Models in which initial cores grow through 
the accretion of ~1-km-sized solid bodies (called 
planetesimals) (8-10, 39), or by accreting pebble- 
sized material (40, 41), predict that very low- 
mass stars are only capable of forming compact 
systems of rocky planets on short-period or- 
bits. The maximum mass of planets formed 
through core accretion in simulations of the 
planetesimal-driven scenario around low-mass 
stars is about 5 M⊕ (9), and 3 M⊕ in the pebble 
accretion-driven scenario (40, 41). With a 
Fig. 1. HPF radial velocity observations of LHS 3154b. (A) Radial velocity 
(RV) as a function of Universal Time date (black data points) and the orbital 
model fitted to the data (red line). The parameters of the model are listed 
in Table 1. (B) Residuals (black data points) between the RV observations and 
the model. The gray dashed line indicates 0 m s⁻¹. (C) Periodogram of the RV 
data (black; arbitrary units). A peak at 3.7 days is labeled in red. The vertical red 
dashed line indicates the 1-day alias of the 3.7-day peak. The black horizontal dashed line 
indicates the 0.1% false-alarm probability (FAP) line. (D) Phase-folded radial 
velocities. The best-fitting period P and RV semiamplitude K are listed at top right 
and in Table 1. The red curve is the best-fitting model, and the gray shading indicates 
the 1σ and 3σ credible regions, respectively. (E) Phase-folded residuals. All error 
bars denote 1σ uncertainties. 


Fig. 2. Planet-to-star mass ratios for planets 
orbiting very low-mass stars. The sample is 
Fig. 3. Results from our simulations of core-accretion planet formation. 
Planet mass is shown as a function of orbital distance in astronomical units 
(au). Circles indicate results from simulated systems, 300 in each panel, 
colored by the disk dust mass in that simulation. LHS 3154b is indicated with 
an orange arrow in (A) to (D), with its base indicating the minimum mass 
of 13.2 Earth masses. Gray boxes highlight the region and corresponding 
frequencies of close-in Neptune-mass planets (periods from 1 to 10 days and 
masses from 10 to 100 Earth masses) formed in the simulations. (A) Results 
from typical assumptions (23), drawing the protoplanetary disk dust mass 
Mdust from a distribution with a median of Mdust = 0.8 M⊕ and a disk power law 
index of γ = 1.0. (B) Same as in (A) but with higher median of the disk dust 
mass distribution, by an order of magnitude, with Mdust = 8 M⊕. (C) Same as 
in (A) but with γ = 1.5. (D) Same as in (B) but with γ = 1.5. Typical assumptions 
of planet formation (A) are incapable of forming planets as massive as LHS 
3154b around 0.1 M☉ stars. To form LHS 3154b-mass planets requires us to 
increase the mass of the disk (B), preferably in more compact disks (D) that 
have higher dust surface densities close to the star, facilitating formation of 


minimum mass of 13.2 M⊕, LHS 3154b is difficult 
to explain with core-accretion models. 


Core-accretion simulations 


The outcomes of planet-formation models de- 
pend sensitively on the assumed protoplanetary 
disk properties, especially the total disk dust 
mass and its surface-density distribution as a 
function of distance from the star. We performed 
planet-formation simulations for the LHS 3154 
system based on the core-accretion scenario by 
modifying a previous model (9) to include gas 
accretion (supplementary text) (23, 42). 

We show in Fig. 3 the resulting simulation 
outcomes for four combinations of assumed 
total disk mass and disk surface density dis- 
tribution, the latter parameterized by a power 
law index γ. Results are shown in Fig. 3A from 
simulations in which the disk dust masses 
are drawn from a distribution of masses 
consistent with observations for a 0.1 M☉ 
star (15) and a nominal surface density distribution 
with power law index γ = 1 (23). 
Simulations with these parameters do not 
form any close-orbiting planets as massive as 
LHS 3154b. The results from simulations in 
which the median mass of the disk dust mass 
distribution has been increased by an order of 
magnitude are shown in Fig. 3B. Simulation 
results are shown in Fig. 3, C and D, for the 
same mass distributions as in Fig. 3, A and B, 
but in more compact disks (γ = 1.5). A small 
number of planets with properties similar to 
those of LHS 3154b are produced in the two 
simulations with higher total dust mass distrib- 
utions (Fig. 3, B and D). In those simulations, 
the larger amounts of solid material produce 
on average more massive initial cores, which 
can grow into more massive planets. The more 
compact disks increase the density of dust 
close to the star, leading to more collisions 
that form close-orbiting planets. However, we 
found that more compact disks alone, without 
larger dust disk masses (Fig. 3C), do not form 
planets similar to LHS 3154b in our simulations. 


Gravitational instability models 


Another possible way to form massive plan- 
ets is through gravitational instability, which 
has been invoked to explain massive gas giant 
planets around low-mass stars, such as the 
wide-orbiting gas giant planet GJ 3512b (mass 
>0.46 Mjup; P = 203 days) (16). However, LHS 
3154b’s mass of 13.2 M⊕ is much lower than 
the minimum mass of planets formed from gravitational 
instability: Simulations of gravitational 
instability around a 0.1 M☉ star found 
minimum mass fragments of ~60 M⊕ (16), about 
five times larger than the lower limit for LHS 
3154b. Gravitational instability preferentially 
forms planets on wide orbits (16). Although 
we cannot rule out the gravitational instab- 
ility mechanism, if LHS 3154b formed through 
gravitational instability followed by inward 
migration, it would require even greater proto- 
planetary disk masses than we considered 
above for the core-accretion scenario. 


High disk masses 


Both potential formation mechanisms require 
protoplanetary disks that have substantially 
greater dust masses than are typically observed 
around very low-mass stars (15). One possible 
solution to this discrepancy is if a large frac- 
tion of the dust in protoplanetary disks around 
low-mass stars grows to centimeter sizes or 
more (43, 44); pebbles of that size would not be 
detected by the millimeter observations used 
to estimate the overall dust masses, causing 
the masses to be underestimated. Another possibility is 
that disks accrete large amounts of additional 
material from the surrounding parent molecular 
cloud (12). A third possibility is that 
protoplanetary cores form soon (within 1 million 
years) after the host protostar, when protoplanetary 
disks are expected to be more massive than 
at later times. This would enable runaway accretion 
of gases and thereby the formation of a gas 
giant planet (13). 
Ribosomal stalk-captured CARF-RelE ribonuclease 
inhibits translation following CRISPR signaling 


Irmantas Mogila, Giedre Tamulaitiene, Konstanty Keda, Albertas Timinskas, 
Audrone Ruksenaite, Giedrius Sasnauskas, Ceslovas Venclovas, 
Prokaryotic type III CRISPR-Cas antiviral systems employ cyclic oligoadenylate (cAₙ) signaling to activate 
a diverse range of auxiliary proteins that reinforce the CRISPR-Cas defense. Here we characterize a class 
of cAₙ-dependent effector proteins named CRISPR-Cas-associated messenger RNA (mRNA) interferase 1 
(Cami1) consisting of a CRISPR-associated Rossmann fold sensor domain fused to winged helix-turn-helix 
and a RelE-family mRNA interferase domain. Upon activation by cyclic tetra-adenylate (cA₄), Cami1 
cleaves mRNA exposed at the ribosomal A-site thereby depleting mRNA and leading to cell growth 
arrest. The structures of apo-Cami1 and the ribosome-bound Cami1-cA₄ complex delineate the 
conformational changes that lead to Cami1 activation and the mechanism of Cami1 binding to a bacterial 
ribosome, revealing unexpected parallels with eukaryotic ribosome-inactivating proteins. 


CRISPR-Cas systems provide interference 
against invading foreign nucleic acids by 
different mechanisms (1). Type III CRISPR-Cas 
systems combine three different 
enzymatic activities to cope with viral infection: 
(i) CRISPR RNA (crRNA)-guided ribonuclease 
activity to target viral RNA transcripts, 
(ii) DNase activity to degrade viral DNA during 
transcription, and (iii) cyclic oligonucleotide 
(cAₙ) synthase activity to produce cAₙ signaling 
molecules that activate a diverse range of 
auxiliary effector proteins. The latter reinforce 
CRISPR-Cas defense by either killing the infected 
cell or leading to growth arrest (2, 3). 
The auxiliary effector proteins encoded in the 
vicinity of the CRISPR-Cas locus are typically 
composed of the cAₙ-sensing CRISPR-associated 
Rossmann fold (CARF) or SMODS-associated 
and fused to various effector domains (SAVED) 
domains fused to a wide 
variety of effector domains including higher 
eukaryotes and prokaryotes nucleotide-binding 
(HEPN)-ribonucleases (Csm6 and Csx1), DNA 
nucleases (Can1, Can2, and Card1), protease or 
transcriptional regulator Csa3 (4). The CARF7 
clade of the CARF domain family shares a 
conserved CARF-wHTH core that is often fused 
to the RelE ribonuclease-like domain identified 
as a toxin in the type II toxin-antitoxin RelE- 
RelB family (5). Type II RelE family toxins are 
ribosome-dependent endoribonucleases (mRNA 
interferases), which bind the ribosomal A-site 
to cleave the translated mRNA. RelE and RelE- 
like toxins inhibit translation leading to cell 
growth arrest unless they are neutralized by the 
cognate antitoxin, as exemplified by the Escherichia 
coli RelE-RelB system (6, 7). In this work we 
aimed to elucidate the functional and struc- 
tural mechanisms of the CARF-wHTH-RelE 
proteins and their role in type III CRISPR- 
Cas immunity. 


Computational analysis of the CARF7 clade 


We enriched the previously defined CARF7 
clade (5) with new sequence homologs and 
obtained AlphaFold structural models for the 
entire set. Next, using both sequences and pre- 
dicted structures, we identified domain archi- 
tectures for each protein. Our analysis revealed 
that ~90% of CARF7 domains are fused to a 
wHTH domain (tables S1 and S2), as exemplified 
by the Crn1 ring nucleases Sso1393 from 
Saccharolobus solfataricus (8) and Sis0811 from 
Sulfolobus islandicus (9). This two-domain 
CARF-wHTH core in most cases (~80%) is 
fused to additional effector domains, such as 
RelE (mRNA interferases), HD (phosphohy- 
drolases), PD-(D/E)XK (endonucleases) or 
DUF2103 (function unknown) (Fig. 1A, fig. S1, 
and table S3). Phylogenetic analysis of CARF 
domains revealed that the same type of asso- 
ciated effector domain (e.g., RelE) may be 
present in different subclades, whereas proteins 
in a single subclade, comprising closely related 
CARF domains, may feature different effector 
domains. This observation suggests that fusion 
events occurred multiple times independently 
during evolution. 

In this study we focused on proteins rep- 
resenting the dominant CARF-wHTH-RelE 
architecture (Fig. 1B), and consequently we 
investigated the diversity of CARF7-associated 
wHTH and RelE domains. Both sequence (fig. 
S2A) and structure (Fig. 1C and fig. S2B) comparison 
of wHTH domains revealed that those 
in CARF-wHTH-RelE proteins are most closely 
related to wHTH domains in CARF-wHTH-DUF2103, 
suggesting their functional similar- 
toxin domains, are more distant (Fig. 1C and 
fig. S2). RelE domains fused to the CARF-wHTH 
core, though forming a nonhomogenous group, 
clearly differ from previously established RelE 
protein families (fig. S4, A and B), likely due to 
their evolutionary history as part of multi- 
domain proteins. 

Owing to the architecture and genome location 
of CARF-wHTH-RelE proteins, we named 
them Cami1 (CRISPR-Cas-associated mRNA 
interferase 1) (Fig. 1B). For further experimental 
analysis, we selected four Cami1 proteins from 
Allochromatium vinosum (AvCami1), Caldilinea 
aerophila (CaCami1), Candidatus Cloacimonas 
acidaminovorans (CCaCami1), and Caldicellulosiruptor 
hydrothermalis (ChCami1), which are all 
encoded near type III CRISPR-Cas loci and 
belong to different sequence clusters in the 
CARF7 clade (Fig. 1D). 


Cami1 is a ribosome-dependent mRNA 
interferase regulated by cA₄ signaling 


We expressed selected Cami1 proteins in E. coli, 
purified them (fig. S5), and subjected them to 
further biochemical and structural characterization. 
To understand the structural arrangement 
of Cami1, we solved the crystal structure 
of AvCami1 at 1.7 Å resolution (table S4). The 
structure revealed a predicted three-domain 
architecture with the N-terminal CARF domain, 
the middle wHTH domain, and the C-terminal 
RelE domain (Fig. 2A). The AvCami1 CARF 
domain is most similar to that of the SsCrn1 
ring nuclease (PDB ID: 3QYF, DALI Z score 
18.0, RMSD 2.2 Å) (fig. S6A), which cleaves cA₄ 
(8). SsCrn1 active site residues S11 and K168 (8) 
are conserved in AvCami1 (S11 and K157), 
implying that AvCami1 could possess intrinsic 
ring nuclease activity. Indeed, biochemical analysis 
revealed that cA₄, but not cA₆, was cleaved 
by Cami1 proteins (Fig. 2B and fig. S6, B and C). 
Wild-type (WT) AvCami1 converted cA₄ initially 
to the linear intermediate A₄>p and subsequently 
to the final product A₂>p, similar to 
other ring nucleases and CARF effectors (fig. 
S6D) (8, 10-12). AvCami1 mutations S11A and 
K157A in the AvCami1 active site compromised 
cA₄ cleavage (Fig. 2B). 

The structure of the AvCami1 RelE domain 
is similar to the RelE family toxins, the closest 
match being Vibrio cholerae toxin VcHigB2 
(PDB ID: 5JA9, DALI Z score 8.9, RMSD 2.6 Å) 
(13) (Fig. 2C). Structural superposition revealed 
that VcHigB2 (13) active site residues are conserved 
in the AvCami1 RelE domain (Fig. 2C). 
AlphaFold models of CaCami1 and CCaCami1 
revealed that they are similar to other RelE 
family proteins (fig. S7A). Whereas most RelE 
family toxins are monomeric proteins (14), some 
such as the YoeB proteins from Escherichia coli 
(PDB ID: 6OXA) and Staphylococcus aureus 
(PDB ID: 7CUA) are dimers; however, even 
Fig. 1. Cami1 proteins are dispersed within the CARF7 clade. (A) Phylogenetic 
tree of the CARF7 clade based on the nonredundant set (n = 385) of CARF domain 
sequences. Tree leaf coloring corresponds to the type of effector domain fused 
to the CARF-wHTH core. Cami1 proteins, selected for characterization in this study, 
and previously characterized Crn1 proteins (Sso1393 and Sis0811) are indicated. 


then only one subunit is used for mRNA cleavage 
(15, 16). AvCami1 forms a dimer both in 
solution (fig. S5, B and C) and in the crystal (Fig. 
2A); however, the dimerization interface differs 
from YoeB (fig. S7B). AlphaFold models of 
CaCami1, CCaCami1, and ChCami1 suggest monomeric 
structure of their RelE-like domains (fig. 
S7C). Taken together, structural and biochemical 
data suggest that Cami1 could function as a 
cA₄-activated RelE ribonuclease that cleaves 
mRNA in a ribosome-dependent manner similar 
to RelE family toxins. 

To test this hypothesis, we adapted an assay 
previously used to monitor EcRelE-mediated 
RNA cleavage in programmed ribosomes (17). 
Because EcRelE in vitro most efficiently cleaves 
the stop codon UAG and sense codons CAG 
and UCG (7), we programmed E. coli 70S ribosomes 
with a short radiolabeled mRNA substrate 
rib-UAG (table S5) and tRNAfMet to lock 
the AUG codon of mRNA in the ribosome P-site, 
thereby exposing the downstream UAG stop 
codon in the A-site (Fig. 2D). AvCami1 did not 
cleave mRNA alone or bind to the ribosome; 
however, the cleavage of ribosome-bound mRNA 
was induced by cA₄. CaCami1, CCaCami1, and 
ChCami1 also cleaved mRNA in a ribosome- 
and cA₄-dependent manner (fig. S8, A and B). 
Mapping of the cleavage site revealed that 
AvCami1 cleaves the mRNA primarily after the 
third nucleotide of the UAG codon, similar to 
Mycobacterium avium RelE and E. coli YhaV 
(18, 19) (Fig. 2D). At increased protein concentration, 
cleavage was detected both 
after the second and third nucleotide (fig. S8B). 
Meanwhile, CaCami1 and ChCami1, like most 
RelE toxins, cleaved the substrate after the 
second nucleotide (6) (fig. S8B). Cami1 reactions 
were specific for the mRNA (fig. S8C) and a 
single AvCami1 protein was capable of cleaving 
multiple mRNA molecules (fig. S8D). Mass 
spectrometry analysis showed that AvCami1, like 
EcRelE (20), cleaves mRNA leaving a 2',3'-cyclic 
phosphate terminus (fig. S9). AvCami1 RelE 
active site mutations H306A, K317A, R325A, and 
H348A resulted in a 5- to 150-fold reduction in the 
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(B) Domain organization and relative abundance of RelE domain-containing proteins 
in the analyzed CARF7 protein set. CC, coiled coil. (©) wHTH domains of the 
CARF7 proteins clustered according to structural model similarity. Coloring 
corresponds to that in panel (A). (D) Loci of type Ill CRISPR-Cas systems associated 
with RelE-like effectors selected for characterization. 


mRNA cleavage rate (Fig. 2E). In line with the 
proposed RelE mRNA cleavage mechanism, 
which involves a nucleophilic attack by a 2'-OH 
group on the adjacent phosphate (6, 20), Camil 
did not cleave 2’-O-methylated mRNA (fig. S10, 
A and B, and table S5). mRNA substrates con- 
taining random nucleotides (rib-NNN) instead 
of UAG at the A-site were also cleaved by 
Camil (fig. SIOC and table S5), in agreement 
with the relaxed in vivo sequence specificity of 
EcRelE (18). mRNA cleavage was stimulated 
by the linear reaction intermediate A,>p with 
the same efficiency as cA, (fig. SIOD) implying 
a “timer” activation mechanism proposed for 
the Csm6, by which the cA,-bound effector re- 
mains active whereas cA, is rapidly cleaved to 
A4>p, and is inactivated only upon slow con- 
version of Ay>p to A,>p and A,>p release 
(11, 12). In the case of AvCamil, cA, degradation 
was not stimulated by either ribosome binding 
or mRNA cleavage (fig. S11). Collectively, we 
show here that Camil is a ribosome-dependent 
mRNA interferase regulated by cA, signaling. 
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(A) Domain arrangement and overall structure of AvCamil. Conserved residues an mRNA containing an A-site UAG codon, a P-site tRNA™*, and molar excess 


characteristic to the different domains subjected to alanine mutagenesis are 
indicated below the boxes. (B) cAq hydrolysis by AvCamil in vitro. (©) Comparison 
of the active site of AvCamil (magenta) and VcHigB2 (gray, PDB ID: 5JA9). AvCamil 
residues that correspond to the active site residues of VcHigB2 are shown. 

(D) Cleavage activity of AvCamil in the presence of cAy. Reactions contained the 


Structural mechanism of cA4-activated
Camil-ribosome interaction

To determine the structural and molecular mechanism of mRNA interference, we
solved a cryo-electron microscopy (cryo-EM) structure of AvCamil-S11A-H343A bound
to the E. coli 70S ribosome in complex with mRNA and tRNAfMet in the presence of
cA4 (Fig. 3, A and B; fig. S12; and table S6). The complex contains two tRNAfMet
molecules, one bound in the E-site and the other bound to the cognate AUG codon
in the P-site. One AvCamil subunit occupies the ribosomal A-site, and its RelE
domain interacts with the 16S (h18, h30-31, h34, and h44) and the 23S (H69) rRNA
as well as with the P-site tRNAfMet, similar to other RelE family toxins (21)
(fig. S13, A to C). The RelE domain of the other AvCamil subunit makes additional
contacts with h32 of 16S rRNA, and its wHTH domain makes contacts with H43 of
23S rRNA. The CARF domains of both subunits contact the sarcin-ricin loop (SRL)
H95 of 23S rRNA. The Camil dimer also interacts with the ribosomal proteins
uS12, uS19, and uL11 (fig. S13B). mRNA is flexible in the vicinity of the AvCamil
binding site, and consequently only three nucleotides of mRNA interacting with
the P-site tRNA were modeled. The conformation of the decoding center of the
AvCamil-ribosome complex resembles that of the empty A-site 70S ribosome
(PDB ID: 4V6G) (22), whereas the overall ribosome conformation is most similar to the
M. tuberculosis HigB™* toxin-ribosome com- 
plex structure (PDB ID: 7NBU) (21) (figs. S13, 
D and E).

cA4 is bound in the central cavity of the CARF domain dimer, where it adopts an
approximately square planar conformation (fig. S14A). Structural comparison with
the apo AvCamil structure revealed relatively small conformational differences in
the CARF dimer, the largest change being a 35° tilt of the wHTH domains; a similar
corkscrew conformational change of wHTH domains was also observed in the
post-catalytic structure of the cA4 ring nuclease Sis0811 (9) (Fig. 3C and movie S1).
Moreover, the entire RelE dimer is shifted ~15 Å diagonally relative to the
CARF-wHTH dimer (Fig. 3C). This breaks the symmetry of the AvCamil dimer but
enables one RelE domain to fit into the ribosome cleft in the vicinity of the
mRNA (movie S1). Such a binding mode would be unfeasible for the symmetric
apo-AvCamil because of steric clashes of the CARF domain with SRL and uS12 (fig. S14B).
Efficient stimulation of AvCamil mRNA cleavage activity by the noncleavable cA4
analog 2'-F-cA4 (fig. S10D) suggests that the conformational changes of AvCamil
required for its activation are coupled to cA4 binding but not cA4 hydrolysis.
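
The “timer” behavior described earlier in the text can be made concrete with a minimal kinetic sketch. The Python snippet below treats ring-nuclease processing as two sequential first-order steps, cA4 → A4>p → A2>p, and counts the effector as active while either cA4 or A4>p is bound. The function names and the rate constants `k_fast` and `k_slow` are illustrative assumptions, not measured values for AvCamil, and ligand binding and release are not modeled as separate steps.

```python
import math


def timer_species(t, k_fast, k_slow):
    """Return the fractions (cA4, A4>p, A2>p) at time t.

    Sequential first-order scheme: cA4 --k_fast--> A4>p --k_slow--> A2>p.
    k_fast and k_slow are hypothetical rate constants (same time unit as t)
    and must differ from each other for this closed-form solution.
    """
    if k_fast == k_slow:
        raise ValueError("k_fast and k_slow must differ in this closed form")
    ca4 = math.exp(-k_fast * t)
    a4p = k_fast / (k_slow - k_fast) * (
        math.exp(-k_fast * t) - math.exp(-k_slow * t)
    )
    a2p = 1.0 - ca4 - a4p
    return ca4, a4p, a2p


def active_fraction(t, k_fast, k_slow):
    """Fraction of effector still active, assuming it is active while
    bound to either cA4 or A4>p (a simplifying assumption)."""
    ca4, a4p, _ = timer_species(t, k_fast, k_slow)
    return ca4 + a4p


if __name__ == "__main__":
    # Placeholder rates: ring opening 100x faster than A4>p -> A2>p conversion.
    for t in (0.0, 1.0, 10.0, 100.0):
        ca4, a4p, _ = timer_species(t, k_fast=1.0, k_slow=0.01)
        print(f"t={t:6.1f}  cA4={ca4:.3f}  A4>p={a4p:.3f}  active={ca4 + a4p:.3f}")
```

With these placeholder rates, cA4 itself is almost gone by t = 10 while most of the effector is still counted as active, which is the qualitative behavior the timer model describes.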

We found that the AvCamil wHTH domain
exposed on the ribosome surface interacts with 
the C-terminal domain (CTD) of the bL12 ribo- 
somal protein (Fig. 3D). Bacterial protein bL12 
is a key component of the ribosomal stalk 
(23, 24) which is conserved in all domains of 
life and plays an essential role in recruitment 
and activation of translational GTPase factors 
(trGTPases) required for ribosome subunit 


1 December 2023 


of AvCamil. A schematic of the Camil mRNA cleavage experiment is shown on
the right. The 31-nucleotide (nt) RNA substrate consists of a ribosome-binding-site
element (RBS) followed by a spacer and the P-site (AUG) and A-site (UAG) codons.
OH and T1, alkaline hydrolysis and RNase T1 digestion markers. (E) mRNA
cleavage activity of AvCamil RelE domain active site mutants.


association (bIF2), translation elongation (bEF-Tu, bEF-G) and termination (bRF3)
(25, 26). High-speed atomic force microscopy studies confirmed the “factor pooling”
mechanism by an archaeal ribosomal P-stalk, which assembles multiple trGTPases
around the ribosome, thereby increasing their local concentration (27). Despite
the fact that trGTPases and AvCamil use different structural elements for bL12 CTD
binding (28-31) (fig. S14C), superposition of the E. coli EF-G-ribosome complex
(PDB ID: 7N2C) (32) with the AvCamil-ribosome structure showed that AvCamil binds
at the ribosomal A-site in the same position as EF-G (fig. S14D). This finding
suggested that Camil may use the ribosome stalk-mediated capture mechanism for
entering the ribosomal A-site. To test this hypothesis, we evaluated AvCamil
activity using bL12-deficient E. coli 70S ribosomes in the presence and absence of
a standalone bL12 protein (24, 33, 34). AvCamil showed only trace mRNase activity
with bL12-depleted 70S ribosomes; however, the mRNA cleavage activity was
reconstituted by supplementing the bL12-depleted ribosomes with bL12 protein
(Fig. 3E). CaCamil, CCaCamil, and ChCamil followed a similar pattern (Fig. 3E),
suggesting that the ribosome stalk-mediated capture is common for Camil proteins.
The observed compatibility of the E. coli ribosome with Camil from different
microorganisms is consistent with the high sequence identity (>50%) between the
L12-CTDs of E. coli, A. vinosum, C. aerophila, Candidatus C. acidaminovorans,
and C. hydrothermalis bacteria (fig. S15A). Mutations of AvCamil wHTH and CARF
residues that make contacts with the conserved bL12 CTD residues located in the
α4 and α5 helices (35) (Fig. 3D and fig. S15A) compromised mRNA cleavage to
various extents (Fig. 3F and fig. S15B), confirming the importance of the bL12 CTD
and wHTH interaction for the ribosome-dependent mRNA cleavage activity of AvCamil.
We also examined the interaction between bL12 and AvCamil using biolayer
interferometry (BLI). AvCamil was captured on a Ni2+-nitrilotriacetic acid sensor
chip as the ligand, and bL12 protein was used as the analyte. The amplitude of
bL12 binding obtained in the presence of cA4 was considerably higher than that in
the absence of cA4, confirming the importance of cA4 in stabilizing the
AvCamil-bL12 interaction (Fig. 3G and fig. S15C). As expected, AvCamil wHTH
surface mutants displayed no binding to bL12 in the presence of cA4.

Fig. 3. Cryo-EM structure of AvCamil bound to the E. coli ribosome.
(A) Top view of the 70S E. coli ribosome with the 50S (light blue) and 30S
(gray) subunits surrounding AvCamil-S11A-H343A (at the A-site, yellow,
turquoise, and magenta, respectively), cA4 (cyan), tRNAfMet (dark green), and
mRNA (orange). The CTD of bL12 is violet. (B) Close-up view showing AvCamil
CARF domain interactions with uL11 (50S) and uS12 (30S) proteins and
wHTH domain interaction with bL12 protein (50S). The RelE dimer is in close
vicinity to the 16S rRNA. One subunit of the RelE dimer approaches mRNA
in the A-site. AvCamil, cA4, mRNA, tRNAfMet, and bL12 are colored as in (A).
The rectangular frame with an arrow in (B) corresponds to its projection
depicted in (A), indicating the rotational relationship between the two views.
(C) Conformational changes following superposition of dimeric AvCamil apo-
(gray) and ribosome-bound cA4-activated (colored) states. (D) Interface
between the CTD of bL12 protein (violet) and the wHTH domain (turquoise) of AvCamil.
(E) mRNA cleavage by Camil using WT and bL12-depleted 70S E. coli
ribosomes. (F) Ribosome-dependent mRNA cleavage activity of AvCamil-S11A
wHTH domain mutant proteins. (G) Amplitudes of bL12 binding BLI signal
normalized by AvCamil loading at various bL12 concentrations. bL12 binding
by cA4-activated AvCamil-S11A wHTH mutants was compared with that by
AvCamil-S11A in the presence and the absence of cA4.

Fig. 4. Camil provides immunity and degrades mRNA in vivo. (A) The
experimental setup to study Camil activity in vivo. E. coli bacteria
coexpressing heterologous AvCamil protein and the S. thermophilus type III-A
CRISPR-Cas system (StCsm) were used for the bacterial growth assay (B) and
mRNA integrity analysis (C). (B) Bacterial growth curves. The WT and
S11A mutant of AvCamil prevent the growth of bacteria, whereas the RelE-dead
S11A+K317A variant permits growth. Vector, a plasmid vector without the
AvCamil gene. (C) In vivo mRNA integrity analysis. Integrity of lpp, ompA,
ompF, and tufA mRNAs, tmRNA, and 5S rRNA was analyzed by northern blot
in total RNA extracted from E. coli after WT or mutant AvCamil and StCsm
activation in vivo. v, no Camil; WT, wild-type AvCamil; S and SK, S11A and
S11A+K317A mutant variants, respectively. (D) Model for the Camil-mediated
antiviral defense. Infection is sensed by the type III CRISPR-Cas interference
complex through viral transcript binding. This triggers viral RNA and
ssDNA degradation and synthesis of cA4 signaling molecules. cA4 binding to
the Camil CARF domains (yellow) induces conformational changes
of the protein that enable wHTH domain (turquoise) interaction with the
ribosomal bL12 stalk. Ribosomal stalk-captured Camil is delivered into the
A-site of a translating ribosome. The RelE domain (magenta) of Camil cleaves
mRNA and stops translation of both cellular and viral mRNA. Camil also cleaves
tmRNA, hindering one of the ribosome rescue pathways. Cleavage of cA4
by the ring nuclease activity of the CARF domain inactivates Camil. (E) To bind the
translating ribosome, Camil employs an active stalk-capture mechanism,
similar to that of different prokaryotic and eukaryotic translational factors,
bacterial tetracycline ribosomal protection proteins, and eukaryotic
ribosome-inactivating proteins.

Though the exact mechanism behind the observed conformational differences between
the apo- and ribosome-bound Camil remains to be determined, we propose that Camil
binding to the ribosome may follow a two-step mechanism. First, cA4 binding
induces a conformational transition of the CARF-wHTH domains similar to that of
the ring nuclease Sis0811 (9), making the Camil wHTH domains available for
interaction with the bL12 ribosomal stalk protein. This facilitates the second
step, i.e., recruitment of Camil to the ribosomal A-site with a concomitant
rearrangement of Camil into an asymmetric dimer.

Camil provides immunity and degrades
mRNA in vivo

We used bacterial survival assays to probe whether Camil contributes to type III
CRISPR-Cas immunity in vivo. We coupled Camil with the Streptococcus thermophilus
type III-A CRISPR-Cas system (StCsm) (36), which produces various cAn (including
cA4) in response to transcription from foreign DNA (12). The three-plasmid
IPTG-inducible system in E. coli (Fig. 4A and fig. S16A) ensured production of
the Camil activator cA4 in the background of the catalytically dead Cas10 DNase
and Csm3 RNase to avoid cell toxicity. In a spot dilution assay, transformation
of the cA4-producing E. coli cells with pCamil plasmids encoding different WT
Camil proteins yielded the same or a slightly lower (2- to 6-fold) number of
transformants as the empty vector (fig. S16, A and B). However, transformation of
pCamil encoding the AvCamil ring nuclease-deficient S11A variant resulted in
1000 times fewer survivors, presumably because of a higher concentration of
Camil-activating cA4 within the cells (fig. S16B). Inactivation of the AvCamil
RelE domain by the K317A mutation restored bacterial survival. Further, we
monitored the growth of the E. coli cultures after induction of expression of
different Camils. We observed that +WT AvCamil and +AvCamil-S11A cultures, but
not +AvCamil-S11A-K317A with a mutated RelE domain, exhibited a growth defect
upon the induction of target transcription (Fig. 4B). Reduced cell growth was
also observed upon induction of CaCamil, CCaCamil, and ChCamil, confirming the
toxicity of Camil in vivo (fig. S16C).

To monitor Camil mRNA cleavage activity in vivo, we used the same set of E. coli
hosts and plasmids as in the bacterial growth assay (Fig. 4A). Cells were grown
in a liquid medium; total RNA was extracted, and several abundant transcripts
(table S5) were monitored by northern blot. No mRNA degradation was detectable in
cells lacking AvCamil, but RNA degradation at multiple sites was observed for all
transcripts after induction of WT and S11A AvCamil and WT CaCamil expression
(Fig. 4C and fig. S16D). This result shows that in vivo Camil cleaves mRNA with
relaxed codon sequence specificity, similarly to the EcRelE toxin (18). Stalled
ribosomes in E. coli can be recovered by tmRNA, which releases ribosomes from
damaged mRNAs and tags the nascent polypeptides from such ribosomes for further
proteolysis (37). Northern blot analysis of RNA extracted from cA4-producing
cells expressing AvCamil, CaCamil, and ChCamil revealed tmRNA cleavage products
(Fig. 4C and fig. S16D), indicating that cA4-activated Camil cleaves tmRNAs
together with mRNA, thereby blocking ribosome recovery.


Discussion 


We show here that the accessory type III 
CRISPR-Cas protein Camil is a RelE toxin-like 
ribosome-dependent RNA interferase that cuts 
mRNA in response to the cA4 nucleotide signal
produced by the type III CRISPR-Cas system 
(Fig. 4D). RelE family toxins are widely distrib- 
uted in the prokaryotic genomes and function 
as mRNA interferases that cleave translating 
mRNA associated with the ribosome at the 
ribosomal A-site (6). In bacteria, RelE toxicity 
is neutralized by the labile RelB protein through 
protein-protein interactions that disrupt the 
geometry of the critical RelE catalytic residues 
(38). Under deprivation of RelB antitoxin the 
ribosome-dependent RelE mRNA interferase 
is activated (39). We show here that the RelE-like 
toxin domain in Camil is neutralized through 
the CARF-wHTH domain fusion that makes 
RelE inactive. 

CARF/SAVED domains often function as sensors that activate effector domains
through cyclic oligonucleotide binding. Indeed, we show that cA4 binding at the
CARF domain induces the ribosome-dependent mRNA interferase activity of RelE. It
is likely that Camil evolved through domain shuffling between canonical RelE-RelB
toxin-antitoxin systems and CARF-wHTH domain proteins. The function of the CARF
domain as a cA4 sensor is well established; however, the role of the wHTH
domain remained to be elucidated.

We demonstrate here that Camil binding to the ribosome is assisted by the
wHTH:bL12-stalk interaction (Fig. 4E). Despite the absence of substantial
structural similarities between bacterial bL12, archaeal aP1, and eukaryotic
eP1•eP2 ribosomal stalk proteins (40), bL12- and P-stalks are functionally related
to each other (22) and serve as hubs for the accelerated delivery of translational
factors to the ribosome A-site (Fig. 4E). In addition to prokaryotic and
eukaryotic translational factors, bacterial tetracycline ribosomal protection
proteins (TRPP proteins) that rescue translation in the presence of tetracyclines
(41-43) and eukaryotic ribosome-inactivating proteins (RIPs) (44-46) also use
ribosome stalk-mediated capture to get to their binding sites on the ribosome
(Fig. 4E). The biological function of the ubiquitous RIPs is still debated, but it
has been shown that RIPs efficiently inhibit both plant and animal virus
infections through ribosome inactivation or activation of different signaling
pathways that cause infected-cell death, thereby preventing virus propagation
(47, 48). The demonstration that Camil uses a RIP-like mechanism to inhibit
translation upon capture by the ribosomal bL12/eP-stalk implies a common antiviral
strategy shared between eukaryotes and bacteria (49). Different applications of
type II TA systems in biotechnology and human therapeutics, including
antimicrobials, antivirals, and cancer therapy, have been proposed (39). Camil
proteins characterized here provide a reservoir of
tightly regulated bacterial toxins with potential
biotechnological and therapeutic applications.

Time-resolved live-cell spectroscopy reveals EphA2 multimeric assembly

Xiaojun Shi, Ryan Lingerak, Cameron J. Herting, Yifan Ge, Soyeon Kim, Paul Toth,
Juha Himanen, Dolores Hambardzumyan, Dimitar B. Nikolov,

Ephrin type-A receptor 2 (EphA2) is a receptor tyrosine kinase that initiates both
ligand-dependent tumor-suppressive and ligand-independent oncogenic signaling. We
used time-resolved, live-cell fluorescence spectroscopy to show that the
ligand-free EphA2 assembles into multimers driven by two types of intermolecular
interactions in the ectodomain. The first type entails extended symmetric
interactions required for ligand-induced receptor clustering and tumor-suppressive
signaling that inhibits activity of the oncogenic extracellular signal-regulated
kinase (ERK) and protein kinase B (AKT) protein kinases and suppresses cell
migration. The second type is an asymmetric interaction between the amino terminus
and the membrane proximal domain of the neighboring receptors, which supports
oncogenic signaling and promotes migration in vitro and tumor invasiveness in
vivo. Our results identify the molecular interactions that drive the formation of
the EphA2 multimeric signaling clusters and reveal the pivotal role of EphA2
assembly in dictating its opposing functions in oncogenesis.

As the largest family of receptor tyrosine kinases (RTKs), ephrin receptors (Ephs),
with their membrane-tethered ephrin ligands, form signaling complexes at cell-cell
contact sites and regulate a wide range of biological processes during embryonic
development and adult physiology (1-3). Ephrin type-A receptor 2 (EphA2) is the
most affected Eph receptor in human malignancies. It is overexpressed in various
human solid tumor types, including colon, breast, prostate, and lung cancer, as
well as glioblastoma and melanoma, and this often correlates with poor
prognosis (4-8).

Cellular, biochemical, and genetic studies have characterized EphA2 as both a
tumor suppressor and an oncogene. This dual function is generally dictated by its
ligand-binding status. When bound to ligand, EphA2, through canonical tyrosine
kinase signaling, inhibits the RAS-extracellular signal-regulated kinase (ERK)
and phosphoinositide-3-kinase (PI3K)-AKT pathways and inactivates
integrin-mediated cell adhesion (9-12). These properties distinguish EphA2 from
prototypic RTKs that activate AKT and ERK upon ligand binding. Indeed, activated
EphA2 can effectively counter ERK and AKT activation by the epidermal growth
factor receptor (EGFR), platelet-derived growth factor receptor (PDGFR), and
hepatocyte growth factor receptor (c-MET) stimulated by their respective ligands
(10, 13, 14). Genetically, EphA2 deletion (knockout) in mice increases
susceptibility to carcinogenesis in the mouse skin, supporting an intrinsic
tumor-suppressive function (15).
In its ligand-free state, EphA2 becomes a substrate for multiple serine-threonine
kinases that phosphorylate EphA2 on S897 (pS897; where S is serine), including
AKT, p90 ribosomal S6 kinase (p90RSK, a downstream target of ERK), and protein
kinase A (PKA) (13, 16). This noncanonical signaling event turns EphA2 from a
tumor suppressor into an oncogenic protein (13). Indeed, phosphorylation of S897
is an important regulator of many malignant cell properties, including
infiltrative invasion of glioma in vivo (17), metastases of non-small cell lung
cancer (18), resistance to BRAF-targeted therapy of melanoma (19, 20),
chemotherapy resistance of ovarian cancer (21), and transendothelial invasion of
bone marrow endothelium by prostate cancer (22). Thus, noncanonical signaling by
EphA2 has an important role in promoting malignant progression (23-26).
EphA2 is composed of an extracellular domain (ECD), a transmembrane segment (TM),
and an intracellular domain (ICD) that consists of a tyrosine kinase domain and a
C-terminal sterile α motif (SAM). The rigid ECD includes a ligand-binding domain
(LBD), a cysteine-rich domain (CRD) that consists of Sushi and EGF-like motifs,
and two fibronectin type III repeats (FN1 and FN2) (27). Previous studies showed
that ligand engagement induces homotypic interactions between neighboring EphA2
molecules, which is proposed to induce the formation of large EphA2 clusters
intercalated by ephrins (28-30). However, less is known about the molecular
assembly of ligand-free EphA2, which initiates oncogenic signaling through S897.
It remains challenging to reliably characterize the molecular assembly of
receptors in their native environment, such as live cell membranes. Two studies
using fluorescence resonance energy transfer (FRET)-based assays reported
conflicting results on the molecular assemblies of ligand-free EphA2 (31, 32).

We investigated the molecular assembly of EphA2 in the plasma membranes of live
mammalian cells with a time-resolved fluorescence spectroscopy called pulsed
interleaved excitation-fluorescence cross-correlation spectroscopy (PIE-FCCS;
Fig. 1A) (33, 34). PIE-FCCS measures the diffusion of receptors and provides
multiplexed readouts of the degree of oligomerization, mobility, and density of
membrane receptors. The fluorescence lifetime collected with PIE-FCCS further
indicates conformational change within the receptor assemblies (Fig. 1B). These
combined readouts of the different aspects of the molecular dynamics offer a view
of the molecular properties of receptors in live cell membranes. Our results
reveal that EphA2 assembles into function-defining multimers through two
symmetric head-head (HH) and an asymmetric head-tail (HT) homotypic EphA2
interactions and help elucidate the molecular basis that underlies the dual
functions of EphA2 in oncogenesis.


Results 

Time-resolved fluorescence spectroscopy reveals 
preassembly of the ligand-free EphA2 into 
multimers in live cells 


Fig. 1. Multimeric preassembly of ligand-free EphA2 detected by PIE-FCCS
measurements. (A) Schematic of the two-color PIE-FCCS instrument. Two
pulsed lasers are focused on the peripheral membrane of a live COS7 cell
(inset, epifluorescent image) that expresses GFP- or mCherry-tagged membrane
receptors. Scale bar is 5 μm. APDs, avalanche photodiode detectors; TCSPC,
time-correlated single-photon counting. (B) Diagram of multiplexed readout of
molecular dynamics of receptors in cell membranes from PIE-FCCS measurements.
(C and D) Raw data, including fluorescence fluctuation signals (C) and decay of
fluorescence lifetime of GFP (D), that were collected during PIE-FCCS
measurements. au, arbitrary units; F(t), fluorescence signal at time t.
(E) Autocorrelation (green and red) and cross-correlation (blue) functions of the
fluorescence fluctuation signals, and three parameters obtained from the curves.
D_eff, effective diffusion coefficient; G(τ), normalized autocorrelation function
of the fluorescence fluctuation; G_X, cross-correlation function; G_R,
autocorrelation function of mCherry; G_G, autocorrelation function of GFP; τ_D,
the lateral diffusion time within the confocal volume; ω, radius of the confocal
volume. (F) Diagram of the fluorescence lifetime parameter, which indicates the
C-terminal proximity within the protein oligomers. A shorter fluorescence lifetime
of GFP is observed in oligomers with tight C-terminal assembly due to FRET.
(G) Diagram of oligomerization control constructs. The monomeric control is a
coexpression of fluorescent proteins (FP, either GFP or mCherry), each fused
separately to a c-Src membrane localization sequence (Myr-FP). The dimeric control
has the leucine-zipper dimerization motif of GCN4 fused to GFP or mCherry and the
c-Src membrane localization sequence (Myr-GCN4-FP). The multimeric control has the
self-dimerizing kinase domain of EGFR introduced after the GCN4 motif
(Myr-GCN4-EGFRk-FP). (H) Single-cell f_c values for each of the control constructs
taken concurrently with the EphA2 data. (I) Single-cell f_c values of ligand-free
EphA2 in the plasma membranes of three cell lines: COS7, DU145, and SCC728. The
f_c distributions from each cell line

To understand the molecular basis of canonical and noncanonical signaling by
EphA2, we examined the spatiotemporal assemblies of EphA2 in live cells using a
time-resolved fluorescence spectroscopy called PIE-FCCS. In PIE-FCCS, two pulsed
lasers are overlapped in space but interleaved in time (Fig. 1A) (34, 35). Data
are collected from single cells by focusing the lasers at the peripheral plasma
membranes (Fig. 1A, inset), where receptors are freely diffusing. Photons emitted
from receptors, which are labeled with either green fluorescent protein (GFP) or
red fluorescent protein (mCherry) at the C terminus, that diffuse through the
laser focus are collected by two avalanche photodiode detectors and recorded by a
time-correlated single-photon counting module. The recorded photons are binned
and time-gated to yield fluorescence fluctuation signals (Fig. 1C). The signals
are then transformed into auto- and cross-correlation functions. From these
correlation functions (Fig. 1E), three parameters can be obtained (see methods for
details): (i) the cross-correlation fraction (f_c), which is related to the degree
of oligomerization of the protein; (ii) the effective diffusion coefficient, which
describes the mobility of the protein; and (iii) the number of particles, which
indicates the density of protein in the cell membrane. Because the oligomerization
state of the proteins has a direct effect on their mobility, reporting both
parameters (i) and (ii) provides a more complete view of membrane protein
oligomerization in live cells. Control systems consisting of membrane-bound
protein monomers, dimers, and multimers (Fig. 1, G and H; and fig. S1, A and B)
were measured concurrently with EphA2 samples to provide an f_c scale for
quantification of the oligomerization state (36).
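
How the three readouts relate to fitted correlation curves can be illustrated with a short Python sketch. The exact definitions used in this study are given in its methods, which are not reproduced here, so the relations below are the idealized textbook ones for two-color cross-correlation measurements on a membrane and should be read as assumptions: each autocorrelation amplitude equals 1/N, the two detection areas overlap perfectly without spectral cross-talk, and diffusion is two-dimensional through a Gaussian focus. The function name, argument names, and example numbers are all hypothetical.

```python
import math


def pie_fccs_readouts(g_green0, g_red0, g_cross0, tau_d_s, waist_um):
    """Convert fitted correlation parameters into three summary readouts.

    Assumptions (idealized two-color FCCS, not taken from the paper's methods):
      * G(0) of each autocorrelation equals 1/N for that color,
      * the detection areas overlap perfectly and there is no cross-talk,
      * diffusion is 2D through a Gaussian focus of 1/e^2 radius waist_um,
        so that tau_D = waist^2 / (4 * D).
    """
    n_green = 1.0 / g_green0                  # green-labeled particles in focus
    n_red = 1.0 / g_red0                      # red-labeled particles in focus
    n_both = g_cross0 / (g_green0 * g_red0)   # particles carrying both colors
    f_c = n_both / min(n_green, n_red)        # cross-correlation fraction
    d_eff = waist_um ** 2 / (4.0 * tau_d_s)   # effective diffusion, um^2/s
    area = math.pi * waist_um ** 2            # effective detection area, um^2
    return {
        "f_c": f_c,
        "D_eff_um2_per_s": d_eff,
        "green_per_um2": n_green / area,
        "red_per_um2": n_red / area,
    }


if __name__ == "__main__":
    # Made-up fit values for a single cell, for illustration only.
    print(pie_fccs_readouts(g_green0=0.02, g_red0=0.025, g_cross0=0.005,
                            tau_d_s=0.05, waist_um=0.25))
```

With these made-up inputs the sketch gives an f_c of about 0.25 and a D_eff of about 0.31 μm²/s. Such numbers are only interpretable against monomer, dimer, and multimer controls measured alongside the sample, which is why the control scale matters.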

We coexpressed EphA2 tagged with GFP or mCherry in African green monkey kidney
fibroblast-like (COS-7) cells, as described in previous PIE-FCCS studies (37).
Cells expressing EphA2 at densities ranging from 200 to 2000 receptors/μm²
(fig. S1E) were chosen, a range compatible with the physiological amount of EphA2.
In the absence of ligand binding, the f_c values for EphA2 (Fig. 1I) were
significantly larger than those for the dimer control and were more comparable to
those for the multimer control (Fig. 1H). Similar results were obtained in human
prostate cancer cells (DU145) and a mouse cutaneous squamous cell carcinoma cell
line (SCC728) that was established from an EphA2 and EphA1 double-knockout mouse.
These observations indicate that ligand-free EphA2 self-assembled predominantly
into multimers. To our knowledge, this is an unusual property for an RTK in its
ligand-free state. By contrast, the unliganded EGFR in the same PIE-FCCS assay is
predominantly monomeric (37).


Three distinct interfaces in the ectodomain
mediate the multimeric assembly of EphA2

Two symmetric HH interfaces, LBD-LBD and Sushi-Sushi, are required for
ligand-free EphA2 multimeric assembly


To elucidate the molecular basis of ligand-free EphA2 multimerization, we
investigated interfaces that were previously observed in the EphA2 crystal
structures (28, 29). We first investigated the symmetric HH Eph-Eph contact site,
which is located at the C terminus of the LBD and involves residues D129 and G131
(D, Asp; G, Gly) (Fig. 2A) (28, 29). We mutated both residues by substitution with
Asn (D129N) and Ser (G131S), respectively, and designated this mutation “LBD”
(fig. S2A). Molecular dynamics simulations were carried out to verify that these
point mutations did not destabilize the overall structure of the protein (fig. S2B
and supplementary text). When measured with PIE-FCCS, the LBD mutant gave smaller
f_c values than the wild-type (WT) receptor (Fig. 2B, left). This observation
indicates that disruption of the LBD-LBD interface reduces the degree of EphA2
multimerization, which is supported by an increase in receptor mobility (Fig. 2B,
right) that can be ascribed to decreased friction within the membrane for smaller
receptor assemblies relative to larger ones. This is evidence that the LBD
interface participates in the multimerization of the ligand-free EphA2.

We also examined the leucine zipper-like 
symmetric HH interface involving the Sushi 
domains of adjacent receptors (Fig. 2A) (28, 29). 
L223, L254, and V255 (L, Leu; V, Val) were
mutated to arginine to disrupt this Sushi-Sushi
interface (designated “Sushi”; fig. S2A). The
median f_c values decreased from 0.24 to 0.12
(Fig. 2B, left), indicating decreased multimeri- 
zation, which was supported by an increase in 
mobility (Fig. 2B, right). Thus, similar to the 
LBD interface, the Sushi interface appears to 
contribute to multimerization of the ligand- 
free EphA2 receptor on the cell surface. 

The f_c values for a mutant with mutations to both interfaces, designated “LS” (fig. S2A),
were similar to those of the LBD or Sushi mutants alone (Fig. 2B, left). However, LS had
f_c values greater than the zero value of the monomeric controls (Figs. 1H and 2B, left),
which might indicate that an additional interface or interfaces contribute to the
multimeric assembly of ligand-free EphA2. We tested a third HH interface located in the
FN1 domain that was previously observed in the EphA2 ECD crystal structure (28), but
disruption of this interface did not change f_c values or mobility (fig. S2C).

are similar to that of the multimer control, as indicated by the red horizontal
bar. (J) Single-cell f_c values of ephrin-A1 in the plasma membranes of COS-7 cells,
which are close to zero, suggesting that ephrin-A1 is mostly monomeric. In (H) to
(J), the boxes represent third quartile, median, and first quartile, and the
whiskers indicate the 10th to 90th percentile. The total cell number that
was used for each sample is reported at the top of the box plots. Data were
analyzed by one-way analysis of variance (ANOVA) test; ****p < 0.0001, and
ns is not significant.


An LBD-FN2 interface mediates 
asymmetric HT assembly of EphA2 


Structures of the ligand-free EphA2 (29) and
EphA4 (38) showed that an asymmetric Eph-Eph
interface (Fig. 2C) formed between the
LBD (head) and FN2 (tail) of adjacent ligand-free
receptors in cis (on the same cell). This
LBD-FN2 interface (Fig. 2C) in EphA2 (~980 Å²)
involves a salt bridge between D53 and R485,
hydrogen bonding of Q56 with the main chain
of V484, and hydrogen bonding of the main
chain of N57 with N483 (R, Arg; Q, Gln) (Fig. 2C).
We mutated N483L and R485E (E, Glu) to 
disrupt the LBD-FN2 interactions (designated 
“FN2”; fig. S2A). This mutation led to a decrease
in f_c values from 0.24 to 0.15 (Fig. 2D,
left) and increased the mobility of the recep- 
tors (Fig. 2D, right). Thus, the asymmetric HT 
LBD-FN2 interaction also appears to contrib- 
ute to the preassembly of ligand-free EphA2 
into multimers. 


Disrupting both HH and HT interfaces 
prevents EphA2 multimer formation 


We next disrupted all three interfaces (designated
“LSF”; fig. S2A). The f_c values for the LSF
mutant were close to zero, similar to those of
the monomeric controls (Figs. 1, G and H, and 
2E, left), and mobility increased (Fig. 2E, right). 
This demonstrates that the LSF EphA2 mutant 
exists as a monomer. Although additional Eph- 
Eph interfaces with relatively minor contribu- 
tions cannot be ruled out, we conclude that the 
multimeric assembly of ligand-free EphA2 is 
primarily mediated by two HH contacts and 
the HT LBD-FN2 contact. 

A schematic consistent with the PIE-FCCS 
results is presented in Fig. 2I. We propose that 
ligand-free EphA2 may assemble into multimers 
through a “core” assembly of Eph molecules 
connected by symmetric HH interfaces (LBD- 
LBD and Sushi-Sushi) flanked by auxiliary 
arms formed by asymmetric HT (LBD-FN2) 
interactions. 

If so, the distance between the cytoplasmic 
tails of receptors that are involved in auxiliary 
HT assembly would be larger than those of 




[Fig. 2 graphic. Recoverable panel labels: Head-Head (HH) interfaces; LBD-FN2; (F) HH contact disruption, (G) HT contact disruption, and (H) both contacts disruption, each with mEA1 stimulation.]


Fig. 2. Multimerization of ligand-free EphA2 is mediated by HH and HT
interfaces. (A) Domain composition of the EphA2 receptor. The crystal structure
of EphA2 ectodomain adopting HH contact through LBD-LBD and Sushi-Sushi
interfaces is shown. Residues that mediate interactions are labeled. (B) f_c values
(left) and diffusion coefficients (right) of ligand-free EphA2 mutants that harbor a
disruption at the HH interfaces. (C) Model of two ligand-free EphA2 molecules
adopting HT contact through FN2 and LBD based on the crystal structure.
Residues that mediate interactions are labeled. (D) f_c values (left) and diffusion
coefficients (right) of the ligand-free EphA2 mutant FN2, with disruption of the
(F to H) f_c values (top) and diffusion coefficients (bottom) of mEA1-stimulated
EphA2 mutants with disruption at the HH interfaces (F), at the HT contact (G),
and at both contacts (H). (I) Schematic diagram of the molecular assemblies of
WT EphA2 (black box) and mutants that have disruption at HT (blue box), HH
(red box), and both contacts (purple box). This diagram does not represent the
exact numbers of EphA2 molecules in the molecular assemblies. In (B) and (D) to
(H), the apparent diffusion coefficients are summarized in bar graphs and report
the mean and SEM values. In the box plots, boxes represent third quartile,
median, and first quartile, and the whiskers indicate the 10th to 90th percentile.
Data were analyzed by one-way ANOVA and two-tailed t tests; ****p < 0.0001,
***p < 0.001, **p < 0.01, *p < 0.05, and ns is not significant.


only the core assembly left displayed significantly smaller anisotropy values than the LS
mutant (fig. S2E), despite the similar degree of oligomerization as determined from PIE-FCCS
measurements (Fig. 2, B and D; and fig. S2D). This observation indicates a closer proximity
between the ICDs in the FN2 mutant than in the LS mutant (fig. S2F), providing further support
for the model shown in Fig. 2I. ICDs of the monomeric EphA2 LSF mutant were farthest
apart and therefore displayed the highest anisotropy values (fig. S2E). Having a mixture of
both contacts, WT EphA2 showed a medium anisotropy value (fig. S2E).


Ephrin-A1 ligand is monomeric on the cell surface


Disparate results have been reported as to 
whether the ligand ephrin-A has to be arti- 
ficially dimerized to achieve full EphA2 activa- 
tion (39-41). The organization of ephrin-A on 
live cell membranes is yet to be determined. 
Full-length ephrin-A1 with an N-terminal GFP
or mCherry tag was used in PIE-FCCS measurements.
A flexible linker (a five-repeat of GSG)
was introduced between ephrin-A1 and the tag
to minimize potential interference. Ephrin-A1
displayed f_c values close to zero (Fig. 1J), similar
to those of the monomer controls (Fig. 1H). EA1
also showed high mobility comparable to that
of the monomeric control (not shown). Thus,
ephrin-A1 appears to exist as a monomer on
the surface of live cells.


Both monomeric and artificially dimerized 
ephrin-A1 induce higher-order EphA2
receptor clustering


Upon stimulation of cells with ephrin-A1-Fc
(EA1-Fc) that is dimerized by fusion to the
heavy chain of human immunoglobulin G1
(IgG1), which causes robust activation of EphA2
(12, 41), we detected increased f_c values, far
above those for the multimeric ligand-free
receptor, in all cell lines tested (fig. S3A, top).
These increased f_c values, together with a significant
decrease in receptor mobility (fig. S3A,
bottom), indicated the formation of higher- 
order clusters of EphA2 receptors. Although the 
sizes of the clusters cannot be precisely deter- 
mined by PIE-FCCS, the results are consistent 
with EphA2 undergoing ligand-induced lateral 
condensation and aggregation into higher- 
order clusters that have been observed in the 
crystallographic studies (28, 29). 

Because both PIE-FCCS measurements (Fig. 1J) and previous cellular and biochemical
characterizations demonstrate that ephrin-A1 is a monomer under physiological
conditions (40), we investigated the effects of ephrin-A1 monomer (mEA1). Treatment
of cells with mEA1 induced large clusters of WT EphA2 to a similar degree as did
dimeric EA1-Fc (WT in Fig. 2F and fig. S3B). Consistently, stimulation with EA1-Fc
and mEA1 ligands led to similar canonical signaling by EphA2, which was characterized
by tyrosine phosphorylation and suppression of ERK and AKT activities (fig. S3C).


HH interfaces mediate ligand-induced EphA2 clustering,
and HT interactions are outcompeted by ligand


PIE-FCCS measurements showed that disrupting each of the two HH interfaces individually
reduced the degree of higher-order cluster formation induced by mEA1, with more
pronounced effects in the LBD mutant (Fig. 2F). mEA1 stimulation of the LS mutant, which
only retains the HT contact, caused a further reduction of the f_c values to close to zero,
in the range expected of a monomer (Fig. 2F). This is consistent with the relative
affinities of the EphA2 LBD for ephrin-A1 versus FN2 and the fact that the LBD-FN2 and
LBD-ephrin-A1 interfaces overlap (42) (fig. S3D). The LBD-FN2 interaction could be
outcompeted by the LBD-mEA1 interaction, leading to the formation of a ligand-receptor
complex that moves in unison and thus displays the f_c values of a monomer (Fig. 2I).
Indeed, this LBD-FN2 interface is not present in the crystal structures of 1:1
Eph-ephrin complexes (28, 29).

Stimulation of FN2, which harbors mutations
at the HT interface, with mEA1 caused a significant
but milder decrease in the degree of
clustering as compared with stimulation of
WT EphA2 (Fig. 2G). These HT contacts are
thought to help bring more Eph receptors
together before cell-cell contact-induced
Eph-ephrin engagement, which ensures fast and
efficient ligand-induced clustering and con- 
densation (42). Thus, the disruption of the HT 
Eph-Eph contacts may decrease the availability 
of Eph receptors in the immediate proximity of 
the forming Eph-ephrin higher-order clusters. 

The LSF mutant with all three interfaces 
disrupted stayed as a monomer in the presence 
of mEA1 (Fig. 2H). None of the mutations 
affected the ligand binding between EphA2 
and ephrin-A1 on the cell surface (fig. S3E),
indicating that the effects are specifically
associated with Eph oligomerization status.
Confirming the concerns that the artificially
dimerized ligand may force nonphysiological
receptor interactions, stimulation with dimeric
EA1-Fc caused larger f_c values for both the LS
and LSF mutants (fig. S3F) than those induced 
by mEA1 (Fig. 2, F to H). 

The f_c values and diffusion coefficients
obtained from PIE-FCCS measurements are
summarized in table S1. On the basis of these
results, an overall model depicting the molecular
organization of both ligand-free and
-bound EphA2 is schematically summarized
in Fig. 2I.


EphA2 receptor oligomerization regulates 
canonical and noncanonical signaling as well as 
constitutive recycling and endocytosis 

HH interfaces are necessary and sufficient 
for ligand-dependent canonical signaling 


Ligand stimulation induces Eph canonical sig- 
naling characterized by tyrosine phosphorylation 
on the di-tyrosine motif in the juxtamembrane 
(JM) segment (43). Expression of exogenous 
WT EphA2 in SCC728 cells showed strong tyro- 
sine phosphorylation after 15 min of ligand 
stimulation (WT and pY-EphA/B in Fig. 3A).
Longer treatment caused degradation of EphA2
(WT and EphA2 in Fig. 3A). The FN2 mutant 
remained similarly responsive (FN2 in Fig. 3A), 
indicating that the HT LBD-FN2 interactions 
are dispensable for EphA2 tyrosine phosphor- 
ylation. By contrast, perturbations of the two HH 
interfaces (LS) largely blocked ligand-induced 
EphA2 tyrosine phosphorylation (LS in Fig. 3A). 
Similar results were observed in human embry- 
onic kidney cells (HEK293), human glioma stem 
cells (GSC827), human prostate cancer cells 
(PC-3), mouse squamous cell carcinoma cells 
(SCC748), and mouse glioma cells (1816) ex- 
pressing exogenous WT EphA2 (fig. S4, A to G). 
Perturbation of all three interfaces (LSF) caused 
a further reduction in canonical signaling in all 
cell types tested, most notably in HEK293 cells 
(fig. S4). Thus, HH interfaces appear to be indis- 
pensable for ligand-induced canonical signaling. 
Suppression of AKT and ERK activities was
observed in WT- and FN2-expressing PC-3
and 1816 cells (fig. S4, F and G) upon ligand
stimulation, whereas LS- and LSF-expressing
cells showed little or no effect.


EphA2 multimerization through the ECD 
contributes to ICD juxtaposition and 
catalytic activation 


Generally, ligand-binding by RTKs induces 
conformational changes in the ECDs that are 
propagated to the ICDs, which leads to kinase 
domain apposition, catalytic activation, and 
tyrosine transphosphorylation (44, 45). We 
examined the fluorescence lifetime of GFP 
recorded during PIE-FCCS measurements. A 
decrease in GFP fluorescence lifetime indi- 
cates increased FRET efficiency between the 
C-terminal GFP and mCherry tags, indicating 
an increased proximity between the ICDs. We 
designed a panel of controls in which GFP and 
mCherry were kept either closer together or 
farther apart (fig. S1C). The GFP lifetime measurements
revealed a close correlation of GFP
lifetime with the distances in the engineered
proteins (fig. S1D). Ligand stimulation resulted
in increased FRET efficiency in WT and FN2 
(Fig. 3B), consistent with decreased distance 
between kinase domains (Fig. 3C). This spa- 
tial rearrangement coincided with increased 
tyrosine phosphorylation (Fig. 3A). By con- 
trast, little change in FRET efficiency was ob- 
served in LS (Fig. 3B), indicating no change 
in kinase domain proximity (Fig. 3C), which 
is in keeping with a lack of tyrosine phos- 
phorylation (Fig. 3A). These results show that 
HH interfaces propagate ligand-induced con- 
formational change to the ICDs to cause cat- 
alytic activation, whereas HT contact does not 
(Fig. 3C). 
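
The link between donor lifetime and FRET efficiency used in this analysis can be illustrated with the standard relation E = 1 - tau_DA / tau_D, where tau_D is the donor (GFP) lifetime without acceptor and tau_DA is the lifetime with the acceptor (mCherry) nearby. The following is a minimal sketch under stated assumptions: the function name and all lifetime values are hypothetical and are not taken from the measurements reported here.

```python
def fret_efficiency_from_lifetime(tau_da_ns: float, tau_d_ns: float) -> float:
    """Standard lifetime-based FRET efficiency: E = 1 - tau_DA / tau_D.

    tau_da_ns: donor lifetime in the presence of acceptor (ns)
    tau_d_ns:  donor-only lifetime (ns)
    """
    if tau_d_ns <= 0:
        raise ValueError("donor-only lifetime must be positive")
    if tau_da_ns < 0:
        raise ValueError("lifetime cannot be negative")
    return 1.0 - tau_da_ns / tau_d_ns


# Hypothetical lifetimes, chosen only to illustrate the direction of the effect.
TAU_GFP_ALONE_NS = 2.6    # assumed donor-only lifetime
tau_ligand_free_ns = 2.5  # assumed lifetime before ligand
tau_stimulated_ns = 2.3   # assumed lifetime after ligand

e_before = fret_efficiency_from_lifetime(tau_ligand_free_ns, TAU_GFP_ALONE_NS)
e_after = fret_efficiency_from_lifetime(tau_stimulated_ns, TAU_GFP_ALONE_NS)

# A shorter donor lifetime corresponds to higher FRET efficiency,
# i.e. closer proximity between the tagged intracellular domains.
print(f"E before ligand: {e_before:.3f}")
print(f"E after ligand:  {e_after:.3f}")
```

In this reading, a construct whose lifetime does not shorten on stimulation (as described for LS) would show essentially unchanged E.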


HT LBD-FN2 contact facilitates ligand- 
independent noncanonical signaling 


EphA2 noncanonical signaling correlates with S897 phosphorylation (13). Ligand stimulation
reduced phosphorylation on S897 of WT EphA2 (13), an effect that was retained in the
FN2 mutant (pS897 in Fig. 3A and fig. S4, A to G). By contrast, pS897 remained high and
unchanged upon sustained ligand exposure of cells expressing the LS mutant (Fig. 3A and
fig. S4, A to G). Thus, although HH interfaces appeared to attenuate noncanonical
signaling, HT LBD-FN2 contacts facilitated it. A possible explanation is that the
spatially separated kinase domains resulting from LBD-FN2 contact, as shown with
anisotropy and GFP lifetime analysis, facilitate the interaction of S897 with AKT and
RSK. The effects of HH and HT interruptions on canonical and noncanonical signaling by
EphA2 are summarized in fig. S4H.


Effects of the EphA2 multimeric preassembly on ligand-independent
constitutive recycling


Ligand-independent autorecycling is important in regulating basal EphA2 activity (32).
Without exposure to ligand, fluorescence images of HEK293 cells and GSC827 glioma stem
cells showed a clear punctate cytosolic population of WT EphA2 in addition to the plasma
membrane population (Fig. 3D, left; and fig. S5, A and B). Mutations perturbing any of
the three Eph-Eph interfaces increased the accumulation of EphA2 on the cytoplasmic
membrane and decreased the presence of cytosolic, punctate EphA2 (Fig. 3D, left; and
fig. S5, A and B). Thus, multimeric assembly of EphA2 appears to be required for
autorecycling.

[Fig. 3D panel labels: ligand-free; ephrin-A1 15 min]

Fig. 3. HH and HT contacts modulate signaling and endocytosis of EphA2.
(A) EphA2 constructs expressed in SCC728 cells were stimulated with EA1-Fc
(3 µg/ml) for 15 and 60 min and were then lysed. Whole-cell lysates (WCLs) were
subjected to immunoblotting with the indicated antibodies. Controls were the
untreated cells marked as 0 min. (B) FRET efficiency of EphA2 constructs before
and after treatment of cells with ligand. The total cell number that was used for
each sample is reported at the top of the box plots. Boxes represent third
quartile, median, and first quartile, and the whiskers indicate the 10th to 90th
percentile. Data were analyzed by two-tailed t test; ****p < 0.0001, and ns is not
significant. (C) Schematic diagram of signaling and changes in kinase proximity
of WT, FN2, and LS EphA2. pS, phosphorylated Ser; pY, phosphorylated Tyr.
(D) Confocal images of GSC827 cells expressing EphA2-GFP (green) and stained
for Rab5 (magenta). The nucleus was stained with 4',6-diamidino-2-phenylindole
(DAPI) (blue). Separated images of EphA2 constructs and Rab5 proteins are
shown in inverted format. Merged images of the cells are shown in color. White
features indicate the colocalization of EphA2 and Rab5. All scale bars are 5 µm.


Effects of Eph oligomerization on ligand- 
induced endocytosis 


Endocytosis has an essential role in signaling 
by RTKs (46), including the Eph protein kinases 
(47). In agreement with earlier reports (37, 48), 
there was low but detectable WT EphA2 in early 
endosomes, marked by Ras-related protein
Rab5a (Rab5), in the absence of ligand (Fig. 3D,
left), which was increased after 15 min of
ephrin-A1 ligand exposure (Fig. 3D, right). After
60 min of treatment, most WT EphA2 entered
into late endosomes marked by Ras-related
protein Rab7a (Rab7; fig. S5C, right). The FN2
mutant had little colocalization with Rab5 in
the absence of ligand (Fig. 3D, left), consistent
with the lack of autorecycling. However,
clear endocytosis into Rab5 and Rab7 endosomes
was seen in cells exposed to ephrin-A1 (Fig. 3D
and fig. S5C, right). In sharp contrast, LS mutant
receptor was largely refractory to ligand-induced 
endocytosis, as indicated by a mostly plasma 
membrane population and no colocalization 
with Rab5 or Rab7 (Fig. 3D and fig. S5C, right). 

Together with the biochemical analysis, these 
results demonstrate that the core assembly me- 
diated by the HH interfaces is necessary and 
sufficient for ligand-induced canonical sig- 
naling and endocytosis of EphA2. The aux- 
iliary HT contact correlates with noncanonical 
signaling through pS897 and does not pro- 
mote ligand-induced canonical signaling and 
endocytosis.


EphA2 oligomerization regulates cell rounding
and migration

Regulation of cell morphology and migration is among the most characterized functions of
Eph receptors. Stimulation of EphA2 with ephrin-A1 and the ensuing canonical signaling
in vitro leads to cell rounding and inhibition of cell migration (12-14). Time-lapse
imaging showed that treatment of HEK293 cells expressing WT EphA2 with ephrin-A1 led to
rapid cell rounding (Fig. 4A and movie S1). FN2-expressing cells remained responsive to
cell rounding upon ligand stimulation (movie S2), whereas cells expressing LS mutant
receptor became refractory (movie S3), consistent with the resistance of the receptors
to canonical signaling. PC-3 human prostate cancer cells express large amounts of
endogenous EphA2 and were the first cell type reported to round up in response to
ligand stimulation (12). PC-3 cells with CRISPR-Cas9 knockout of EphA2 (fig. S6A)
completely lost their cell rounding response (fig. S6B). Moreover, reconstitution of WT
or FN2 mutant receptor, but not the LS mutant receptor, restored the responsiveness
(fig. S6B). These data demonstrate that HH contact mediates ligand-induced cell rounding
through canonical EphA2 signaling, whereas HT contact does not contribute to cell
rounding.

In a trans-well migration assay, we found that overexpression of LS mutant receptor in
HEK293 cells strongly promoted chemotactic cell migration compared with expression of WT
or FN2 mutant receptor (fig. S6C). We performed a wound-healing assay with a cutaneous
squamous carcinoma cell line (283LM) derived from an EFNA1, EFNA3, and EFNA4 ligand gene
triple-knockout (TKO) mouse (15, 17). Because the wound-healing assay is performed with
freshly confluent cells, the interactions of EphA2 with endogenous ephrin-A ligands on
neighboring cells can complicate data interpretation. Use of cells from the TKO mice
greatly mitigated this concern. Endogenous EphA2 was depleted from 283LM cells by
CRISPR-Cas9, and WT or mutant exogenous EphA2 was reintroduced (fig. S6D).
Reintroduction of WT EphA2 promoted cell migration, whereas the FN2 mutant decreased it
(Fig. 4, B and C). Only HT LBD-FN2 interactions are retained in the LS mutant (Fig. 2I),
and it showed the strongest effects on promoting 283LM cell migration (Fig. 4, B and C;
and fig. S6E), consistent with the enhanced migration of HEK293 cells expressing the
same mutant (fig. S6C). Together these results suggest that the HT contact facilitates
promigratory signaling, possibly through the elevated pS897 noncanonical signaling. The
lack of HT contact in FN2 correlates with a reduction in migratory behavior, whereas WT
EphA2, with a mixture of HT and HH contacts, ranked in between LS and FN2 (fig. S6E).

Fig. 4. HH and HT contacts of EphA2 modulate cell migration in vitro
and invasion in vivo. (A) Sample phase-contrast images of HEK293 cells
expressing WT or mutant EphA2 at 0 (-) and 20 (+) min after ligand stimulation.
Zero time points represent untreated controls. Note that cell rounding occurred in
WT- and FN2-expressing cells (highlighted with red boxes) but not in LS-expressing
cells. A total of three independent experiments were performed. Scale
bars are 5 µm. (B) Scratch-wound assay using EphA2 knockout 283LM cells
restored with the WT or mutant receptors. Sample phase-contrast images at
0 hours (top) and 16 hours (bottom) are shown. The yellow masks define the
area covered with cells. The red lines demarcate the starting margins of the
wound areas. (C) Wound confluency at 16 hours. The wound confluency is
summarized in a bar graph to report the mean value and SEM. A total of 12
wounds were used for each group, and a total of three independent experiments
were performed. Data were analyzed by one-way ANOVA test; ****p < 0.0001,
**p < 0.01, and ns is not significant. (D) Kaplan-Meier survival curve (top)
of mice injected intracranially with 1816 cells expressing WT EphA2 or the
indicated mutant EphA2. A table showing the number, sex, and median
survival of mice is shown at the bottom. (E) Representative whole-mount brain
images are shown at the top. Arrows point to regions of hemorrhage. The
numbers of mice with brain hemorrhage and its hemispheric distribution, on
the basis of gross examination of the whole brain, are shown at the bottom.
(F) Histology analysis of mouse brains. Low-power views of the brain are
shown on the left; corresponding magnified views of the indicated regions are
shown on the right.


Asymmetric HT contact correlates with 
worse host survival in a syngeneic murine 
glioma model 


EphA2 is an oncogenic driver in gliomagenesis, 
in part by promoting diffuse infiltrative inva- 
sion (7, 49), a major cause of poor prognosis of 
the disease. A search of The Cancer Genome 
Atlas (TCGA) database revealed that EphA2 is 
poorly expressed in the normal brain but is 
abundantly up-regulated in glioblastoma (GBM), 
particularly in the mesenchymal and classical 
molecular subtypes (fig. S7A), and the overexpression is correlated with poor overall
survival (49). Using the murine GBM cell line (1816), which lacks expression of Nf1 and
Tp53 (50-52) and shares the molecular signature of human mesenchymal GBM, we examined
the roles of EphA2 oligomerization in gliomagenesis after intracranial implantation of
cells into syngeneic C57BL/6 mice. To this end, WT EphA2, the LS mutant missing HH
contacts, and the FN2 mutant lacking HT interaction were overexpressed in 1816 cells
through retroviral vector-mediated gene transduction. Cells expressing similar levels of
the exogenous receptors were injected intracranially, as reported previously (17). The
survival of recipient mice was monitored using a Kaplan-Meier plot. Compared
with mice receiving parental cells, mice implanted 
with 1816 cells overexpressing WT EphA2 showed 
reduced survival (Fig. 4D), consistent with earlier 
reports (17, 49). Notably, mice injected with cells 
overexpressing LS receptor showed worse sur- 
vival relative to mice injected with WT EphA2- 
overexpressing cells. This observation correlates 
with the promigratory behavior of the LS- 
expressing cells in vitro (Fig. 4C). By contrast, 
mice implanted with 1816 cells expressing the 
FN2 mutant showed improved survival (Fig. 4D), 
which is in keeping with the reduced migration 
of these cells in vitro compared with WT EphA2- 
expressing cells (Fig. 4C). 
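
For readers unfamiliar with how a Kaplan-Meier curve such as the one in Fig. 4D is built, the product-limit estimator can be sketched in a few lines. This is an illustrative sketch only: the function name and the survival times below are hypothetical, and none of the numbers come from the mouse cohorts in this study.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Product-limit survival estimate.

    times:  follow-up time for each animal (e.g. days after implantation)
    events: 1 if death was observed at that time, 0 if the animal was censored
    Returns (time, estimated survival probability) at each observed death time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Animals still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
    return curve


# Hypothetical cohort of six animals; the fourth and sixth are censored.
days = [20, 22, 22, 25, 30, 30]
observed = [1, 1, 1, 0, 1, 0]
curve = kaplan_meier(days, observed)
for day, s in curve:
    print(f"day {day}: S = {s:.3f}")
```

Comparing groups (for example WT versus LS) would additionally require a test such as the log-rank test, which is not shown here.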

Gross examination of the whole brain revealed that mice that received LS-expressing
cells showed a high frequency of tumor cells invading across the midline to the other
hemisphere of the brain (10 of 14 mice), which was accompanied by the uniform presence
of hemorrhage (Fig. 4E and fig. S7B). Invasion across
the midline occurred at much lower frequencies 
for the tumors with WT- and FN2-expressing 
cells, and hemorrhage was milder and present 
less often. Histological analysis confirmed exten- 
sive spreading of the tumor cells expressing LS 
to the other hemisphere with hemorrhage at the 
periphery of the tumor mass, whereas the 
tumors expressing WT or FN2 mutant EphA2 
were often restricted to one side of the brain 
(the site of tumor cell implantation) with less 
prominent hemorrhaging (Fig. 4F). Thus, multi- 
meric assembly of EphA2 appears to influence 
malignant invasive behaviors in vivo. Disrupt- 
ing the core of the Eph oligomeric assemblies 
promoted invasion and reduced host survival 
in vivo by ablating canonical and promoting 
noncanonical signaling, whereas disrupting 
auxiliary HT contact improved host survival by 
attenuating noncanonical signaling. 


Discussion 


We report that in the absence of ligand binding, EphA2 receptors are assembled into
multimers through symmetric HH (LBD-LBD and Sushi-Sushi) and asymmetric HT (LBD-FN2)
interfaces (Fig. 5A). The direct measurement of these reported interfaces of EphA2 in
their native environment provides an unusual example of multimeric assembly of an RTK in
the absence of ligands. Engagement through symmetric interfaces forms the core of the
Eph multimer, and the interaction through asymmetric interfaces extends multimerization
through auxiliary assembly on the flanks. Functionally, this assembly keeps kinase
domains in the ICD apart to facilitate the ligand-independent noncanonical signaling
through AKT-, RSK1-, and PKA-mediated phosphorylation at S897. Upon ligand stimulation,
the asymmetric FN2-LBD interactions are displaced by high-affinity ligand-receptor
(ephrin-LBD) interactions (Fig. 5B). Seiradake et al. reported a 71° rotation of the FN2
domain relative to the rest of the Eph ECD (LBD-CRD-FN1), which is structurally rigid,
upon ligand binding (28). This reorientation at the hinge-like FN1-FN2 linker would
facilitate the recruitment of additional receptors into the EphA2-ephrin clusters
(Fig. 5B). The conformational changes in the ECD are propagated to the ICD to induce
rearrangement of the kinase domains into close proximity for transphosphorylation on
tyrosine residues. These clusters then undergo lateral condensation into large
EphA2-ephrin higher-order clusters (Fig. 5C) to achieve ligand-dependent canonical
signaling, whereas the noncanonical signaling through phosphorylation of S897 is
attenuated (Fig. 5D). It has been suggested that the mechanical force at the Eph-ephrin
junction plays a role in the formation of higher-order clusters (30, 53).

[Fig. 5A panel labels: ligand-free EphA2; noncanonical signaling, oncogenic; core LBD-LBD]

Fig. 5. Schematic depictions of the molecular assembly of EphA2 on the
cell surface. (A) Multimeric assembly of EphA2 in the absence of ligands.
(B) Ligand-induced conformational changes of EphA2, including 71° rotation
of the FN2 domain relative to the rest of the Eph ECD. (C) Rearrangement of
the kinase domains into close proximity for transphosphorylation on tyrosine
residues. (D) Lateral condensation into large EphA2-ephrin higher-order
clusters accompanied by activation of canonical signaling and suppression
of noncanonical signaling. See text for more details.

The molecular assembly of EphA2 has been 
examined by various FRET assays, leading to 
reports of either monomers or dimers (31, 32). 
FRET measurements depend on the proximity 
of the fluorescent tags on EphA2, and the
typical Förster radius for fluorescent proteins
is around 60 Å (54). However, the length of the
rigid ECD of EphA2 is around 146 Å (29), well
beyond the Förster radius. Because the EphA2
receptors are connected by HT (LBD-FN2) in- 
teractions (auxiliary in Fig. 5A), the long dis- 
tance between the ICDs is not expected to be 
detected in the FRET assays. Unlike FRET, 
PIE-FCCS measures the diffusion of tagged re- 
ceptors to quantify their oligomerization state 
and is thus compatible with more spatially dis- 
tant fluorescent tags within the same molecular 
assembly, such as those in the auxiliary arms. 
The changes in oligomerization are also corrob- 
orated by changes in the mobility of the 
EphA2 assemblies, which are measured directly 
with PIE-FCCS. In addition to these diffusion- 
based readouts, PIE-FCCS measures the fluo- 
rescence lifetime, which provides information 
on the Eph spatial arrangement within the oligo- 
mers. With each of these interconnected pieces 
of information (summarized in table S1), PIE- 
FCCS provides an improved characterization 
of the contribution of the ECD domains to 
the functional EphA2 assembly in live cell 
membranes. 
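
The distance argument above can be checked with the standard Förster relation E = 1 / (1 + (r / R0)^6). The sketch below is a back-of-the-envelope illustration, not part of the study's analysis: it takes the quoted Förster radius (about 60 Å) and, as a simplifying assumption, uses the quoted ECD length (about 146 Å) as a stand-in for the separation between tags on receptors joined head to tail. The function name is hypothetical.

```python
def fret_efficiency_from_distance(distance_angstrom: float,
                                  forster_radius_angstrom: float) -> float:
    """Forster relation: E = 1 / (1 + (r / R0)**6)."""
    if distance_angstrom < 0 or forster_radius_angstrom <= 0:
        raise ValueError("distance must be >= 0 and Forster radius > 0")
    ratio = distance_angstrom / forster_radius_angstrom
    return 1.0 / (1.0 + ratio ** 6)


R0_ANGSTROM = 60.0           # typical value quoted in the text for fluorescent proteins
ECD_LENGTH_ANGSTROM = 146.0  # quoted ECD length, used here as an assumed tag separation

for r in (30.0, R0_ANGSTROM, ECD_LENGTH_ANGSTROM):
    e = fret_efficiency_from_distance(r, R0_ANGSTROM)
    print(f"r = {r:6.1f} A -> E = {e:.4f}")
```

Because efficiency falls with the sixth power of distance, a separation of roughly 2.4 times R0 gives an efficiency below 1%, which is consistent with the statement that head-to-tail contacts would be difficult to detect by FRET alone.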

The multiple interactions of ligand-free
EphA2 multimers are unusual for RTK cell- 
surface organization (Fig. 5A). Previous studies 
that used the same PIE-FCCS technology showed 
that ligand-free EGFR is present predominantly 
as a monomer on the cell surface (37). Dimeri- 
zation of ligand-free receptors has been observed 
in several RTKs through various molecular 
mechanisms. For example, the transmembrane 
region mediates the ligand-independent dimeri- 
zation of discoidin domain receptor tyrosine 
kinase 1 and 2 (DDR1 and DDR2) (55), and the 
insulin receptor and the closely related insulin-like 
growth factor 1 (IGF1) receptor are both 
expressed on the cell surface as preexisting 
disulfide-linked dimers (44). 

Finally, the multimeric assemblies of ligand- 
free EphA2 have pathological and therapeutic 
implications. EphA2 is overexpressed in many 
solid human tumors, which is often accompa- 
nied by simultaneous loss of ligand expression 
(4-8), creating conditions that promote multi- 
meric assembly of ligand-free EphA2 and onco- 
genic signaling through S897 phosphorylation. 
Because the EphA2 ECD plays a dominant role 
in receptor multimerization, its accessibility 
may make it amenable to therapeutic interven- 
tions. The LBD-FN2 asymmetric EphA2 inter- 
face may also be a good target for therapeutic 
development. By disrupting asymmetric Eph- 
Eph interactions, pro-oncogenic noncanonical 
unliganded-EphA2 signaling could be attenu- 
ated, which might be exploited alone or in 
conjunction with other agents to suppress 
malignant tumors. 

CLIMATE SCIENCE 


State dependence of CO2 forcing and its implications for climate sensitivity 


Haozhe He, Ryan J. Kramer, Brian J. Soden, Nadir Jeevanjee 


When evaluating the effect of carbon dioxide (CO2) changes on Earth's climate, it is widely assumed 
that instantaneous radiative forcing from a doubling of a given CO2 concentration (IRF2×CO2) is 
constant and that variances in climate sensitivity arise from differences in radiative feedbacks or 
dependence of these feedbacks on the climatological base state. Here, we show that the IRF2×CO2 is 
not constant, but rather depends on the climatological base state, increasing by about 25% for every 
doubling of CO2, and has increased by about 10% since the preindustrial era primarily due to the 
cooling within the upper stratosphere, implying a proportionate increase in climate sensitivity. This 
base-state dependence also explains about half of the intermodel spread in IRF2×CO2, a problem that 
has persisted among climate models for nearly three decades. 


Radiative forcing (RF) refers to a change in 
net radiative flux at the top-of-atmosphere 
(TOA) due to an externally imposed perturbation 
in the Earth’s energy balance (1, 2), such as 
anthropogenic activities (e.g., emission of 
greenhouse gases and aerosols) or natural events 
(e.g., volcanic eruptions). The Earth subsequently 
warms or cools to counteract the flux perturbation 
and restore radiative equilibrium. The RF is 
commonly separated into two parts (1, 3-6): 
(i) instantaneous radiative forcing (IRF), which 
measures the change in net radiative flux that 
results only from the change in forcing agents, 
and (ii) rapid adjustments, which consist of 
radiative perturbations induced by atmospheric 
responses to the IRF independent of any change 
in surface temperature. This study focuses on 
the IRF, which is considered to be the 
best-understood aspect of RF (7). For CO2 
perturbations, the IRF is responsible for 
approximately two-thirds of the total RF and is 
the fundamental driver of the rapid adjustments 
(1, 3-6, 8-12), wherein stratospheric cooling is 
the dominant adjustment to CO2 forcing (11, 12). 
However, several previous studies have shown 
that the IRF from a doubling of CO2 concentration 
(IRF2×CO2) varies by ~50% among climate models 
(10, 13-15). Although this spread has persisted 
for nearly three decades, its underlying 
cause has never been fully resolved. 

Climate sensitivity is formally defined as the 
change in the global mean surface temperature 
required to restore radiative equilibrium 
in response to a doubling of CO2 concentration 
(ΔT2×CO2). It is the most widely used metric 
to quantify the susceptibility of the climate 
to an externally forced change, i.e., ΔT2×CO2 = 
-RF2×CO2/λ, where the radiative damping (λ, 
which is expressed in watts per meter squared 
per degree kelvin) is the efficiency at which 
radiative equilibrium is restored per unit change 
in surface temperature. The radiative damping 
depends on a number of well- and not-so-well-understood 
feedbacks within the climate system 
and is widely recognized to vary between 
climate models and in time as the climatological 
base state evolves. However, the intermodel 
variance in the RF2×CO2 and its dependence 
on the base state are less well recognized. In 
this study, we demonstrate that the IRF2×CO2 
is not a constant, but rather depends on the 
climatological base state, as suggested by a 
recent analytical model (16). This state dependence 
not only explains about half of the 
intermodel variance in IRF2×CO2, but it also 
fundamentally reshapes our understanding of 
climate sensitivity, with important implications 
for both past and future climate changes. 


Results 


The Coupled Model Intercomparison Projects 
(CMIP) provide a series of coordinated experi- 
ments performed in support of the Intergov- 
ernmental Panel on Climate Change (IPCC) 
assessments in which model simulations are 
achieved using identical emission scenarios 
(17, 18). However, because determining the 
IRF requires additional calculations, it is not 
routinely computed for most experiments. In 
the first comprehensive RF comparison among 
climate models, Cess et al. (13) found that the 
IRF2×CO2 ranged from ~3.3 to 4.7 W m⁻². Subsequent 
studies with newer generations of 
models found a similar range (10, 14). This 
spread was thought to mainly arise from intermodel 
differences in the parameterization of 
infrared absorption by CO2 (15). 

Double-call radiative transfer calculations 
are the most direct method for diagnosing the 
IRF in model simulations. To produce these 
specialized online diagnostics, a second call is 
made to the radiation scheme at each time step. 
Radiative fluxes are recalculated with a hypothetical 
forcing agent perturbation, such as CO2 
at some increased concentration. These perturbations 
are solely used to diagnose the IRF and 
do not interact with the model simulation. Although 
only a few online double-call calculations 
were performed by climate models from CMIP5/6, 
the available output is particularly useful for 
investigating the state dependence of CO2 IRF. 
To avoid the complicating effects of clouds in 
masking the IRF (7, 19, 20), we further simplified 
our analysis by limiting it to infrared CO2 forcing 
at the TOA under clear-sky conditions. 
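
The double-call procedure described above can be sketched with a toy model. This is a minimal illustration only: `toy_net_flux` is a hypothetical stand-in for a radiation scheme (its CO2 dependence is invented for the example and is not taken from SOCRATES or any CMIP model), and the single-temperature "model" is an assumption chosen so that the script runs on its own. It shows the two properties the text relies on: the second call yields the IRF as a flux difference, and only the first call feeds back on the simulated state.

```python
import math

SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def toy_net_flux(temp_k, co2_ppm):
    """Hypothetical radiation scheme: net downward TOA flux in W m^-2.

    The logarithmic CO2 term is an invented placeholder, not a real
    parameterization.
    """
    emissivity = 0.62 - 0.01 * math.log2(co2_ppm / 280.0)
    return 240.0 - emissivity * SIGMA * temp_k ** 4


def run_with_double_call(n_steps, co2_ppm=280.0, factor=4.0,
                         heat_capacity=1.0e8, dt=86400.0):
    """Step a one-temperature toy model and diagnose the IRF online."""
    temp_k = 288.0
    irf_series = []
    for _ in range(n_steps):
        # First call: the flux that actually drives the model state.
        flux = toy_net_flux(temp_k, co2_ppm)
        # Second call: same state, perturbed CO2; used for diagnosis only.
        flux_perturbed = toy_net_flux(temp_k, factor * co2_ppm)
        irf_series.append(flux_perturbed - flux)
        # Only the unperturbed flux updates the temperature.
        temp_k += dt * flux / heat_capacity
    return temp_k, irf_series


final_temp, irf = run_with_double_call(n_steps=30)
```

Because the perturbed flux never enters the temperature update, changing `factor` alters the diagnosed IRF but leaves the simulated trajectory unchanged, which mirrors the statement that the perturbations do not interact with the model simulation.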

Figure 1A shows the online double-call cal- 
culations available from the CMIP5/6 models 
for the Atmospheric Model Intercomparison 
Project (AMIP) historical experiment (amip), 
which contains the most online double-call cal- 
culations of any of the CMIP experiments (12 of 
80 participating models provided calculations 
for this experiment; tables S1 and S2). The amip 
experiment consists of atmosphere-only model 
simulations that all used identical, time-varying 
sea surface temperatures observed over the period 
1979 to 2008 as boundary conditions. The 
online double calls provided are for 4xCO2; note 
that IRF4×CO2 ≈ 2 × IRF2×CO2 for a given climate 
state (see the materials and methods). The results 
exhibit a large intermodel spread (ranging 
from ~4 to 8 W m⁻²), consistent with that observed 
in previous model generations (15). 

To investigate the extent to which differ- 
ences in the thermal structure of the climato- 
logical base state can explain the intermodel 
spread of IRF, we performed offline double-call 
IRF4×CO2 calculations using original atmospheric 
profiles from the AMIP models and a 
single radiative transfer model (SOCRATES; 
see the materials and methods). In contrast to 
the online counterparts, the same radiative 
transfer parameterization is used in all of the 
offline calculations, so their intermodel spread 
is only due to differences in the climatological 
base states. The strong correlation (r = 0.82) 
between the IRFs from the online and offline 
double-call calculations (Fig. 1B) suggests that 
more than half of the intermodel variance in 
IRF4×CO2 results from differences in climatological 
base states, not from differences in representing 
the spectral absorption of CO2. This 
is consistent with a recent study by Pincus et al. 
(19), who computed IRF from different radiative 
transfer schemes but using the same climatological 
base state, finding a much smaller spread in 
IRF4×CO2 than in the online double calls (Fig. 
1A). Together, these studies provide compelling 
evidence to suggest that intermodel differences 
in the climatological base state are an essential 
contributor to the spread in CO2 IRF. 
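
As a rough check on the "more than half" statement, and assuming the usual reading that the squared Pearson correlation gives the fraction of variance shared linearly between two quantities (an interpretive assumption, not a calculation taken from the paper), the reported correlation implies:

```latex
r^2 = 0.82^2 \approx 0.67 > 0.5
```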

The influence of the base state on CO2 IRF 
is more clearly illustrated in the coupled model 
simulations from CMIP6, in which a 1% per year 

Fig. 1. Intermodel spread in IRF4×CO2 and its 
causes. (A) Time series of all available online double-call 
IRF4×CO2 with base state from amip experiments 
for CMIP5/6 models. The black vertical reference 
line highlights the IRF4×CO2 values used in (B), and 
the gray line accentuates the brief declines in the 
magnitude of the IRF4×CO2 in the year 1992, after the 
eruption of Mount Pinatubo. (B) Comparison of the 
IRF4×CO2 in the year 2000 from the online and 
offline double-call calculations. The gray filled circles 
represent models from CMIP6, and the open circles 
with a cross inside show models from CMIP5. The red 
filled circle with a cross inside highlights the outlier 
model (i.e., CanAM4). Because the vertical IRF profile 
of CanAM4 shows an increase with height within the 
stratosphere [see figure 3 of Chung and Soden (10)], 
it differs from the common expectation based on 
the negative lapse rate within the stratosphere. It is 
reasonable to exclude the results of the CanAM4 
from the spread contribution analysis. The values in 
front of (in) parentheses shown in (B) are values 
calculated without (with) the outlier model CanAM4. 
(C) A scatterplot of global and annual mean air 
temperature at 10 hPa of each model in the year 2000 
of the amip experiment versus its corresponding 
offline double-call IRF4×CO2. 


increase is imposed on the atmospheric CO2 
concentration (1pctCO2; Fig. 2). Although only 
two models (Fig. 2A, solid lines) submitted online 
double-call calculations, the results reveal 
a substantial growth in IRF4×CO2 as the climatological 
base state evolves. For both models, 
IRF4×CO2 increases from ~5 W m⁻² when 
IRF4×CO2 is computed in a preindustrial climate 
to ~8 W m⁻² when it is computed in an 
elevated-CO2 climate. This challenges the widely 
held assumption that the IRF2×CO2 is constant 
(21-23); on the contrary, it demonstrates 
that the CO2 IRF is a dynamic quantity that 
changes substantially as the climate changes. 
To verify this result, we performed a series 
of line-by-line and SOCRATES offline double-call 
calculations using the full suite of CMIP5/6 
coupled simulations under the 1pctCO2 scenario 
(Fig. 2A, markers). These results both 
confirm the marked increase in IRF4×CO2 using 
a much larger ensemble of models and, because 
the same radiative transfer scheme is 
used for all offline calculations, indicate that 
changes in the climatological base state are 
responsible for this increase. Note that the climatological 
base state here includes the thermal 
structure as well as the base-state CO2 
concentration (24-26), both of which vary with 
each time step. However, most of the IRF4×CO2 
increases are due to the evolution of thermal 
structure, especially for the first doubling of 
base-state CO2 concentration (fig. S1). 

According to the analytical model of 
Jeevanjee et al. (16), the dependence of CO2 
IRF on the climatological base state can be 
understood in terms of dependence on the 
emission temperature of both stratosphere and 
troposphere as follows: 

\[ F = 2l \ln\left(\frac{q_f}{q_i}\right)\left[\pi B(\nu_0, T_{\mathrm{em}}) - \pi B(\nu_0, T_{\mathrm{strat}})\right] \qquad (1) \]

where $l$ is the “spectroscopic decay” parameter 
of 10.2 cm⁻¹, $q_i$ is the initial CO2 concentration, 
$q_f$ is the final CO2 concentration, and 
$\pi B(\nu_0, T_{\mathrm{em}})$ and $\pi B(\nu_0, T_{\mathrm{strat}})$ are the 
hemispherically integrated Planck function at the peak 
absorption wave number of CO2 with the 
tropospheric and the stratospheric emission 
temperature, respectively (see the materials 
and methods). The latter refers to the temperature 
of the upper stratosphere, where unit 
optical depth is achieved by the peak of the CO2 
absorption band, whereas the former depends 
on surface temperature and free troposphere 
relative humidity. This model has been used to 
help explain the spatially inhomogeneous distribution 
of IRF that results from a spatially 
uniform increase of CO2 (27). 
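
Equation 1 is simple enough to evaluate directly. The sketch below is illustrative only and rests on stated assumptions: the peak absorption wave number `NU0 = 667.5` cm⁻¹ is an assumed value for the center of the CO2 15-μm band (the text does not quote it), the emission temperatures are made-up round numbers rather than model output, and the function names `pi_planck` and `co2_irf` are hypothetical. Only the decay parameter of 10.2 cm⁻¹ comes from the text.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m s^-1
KB = 1.380649e-23    # Boltzmann constant, J K^-1

L_DECAY = 10.2       # "spectroscopic decay" parameter l from the text, cm^-1
NU0 = 667.5          # assumed peak CO2 absorption wave number, cm^-1


def pi_planck(nu_cm, temp_k):
    """Hemispherically integrated Planck function, W m^-2 per cm^-1."""
    nu_m = nu_cm * 100.0  # wave number in m^-1
    radiance = (2.0 * H * C ** 2 * nu_m ** 3
                / math.expm1(H * C * nu_m / (KB * temp_k)))
    # radiance is per m^-1; multiply by 100 to express it per cm^-1,
    # and by pi to integrate over the hemisphere.
    return math.pi * radiance * 100.0


def co2_irf(q_i, q_f, t_em, t_strat, nu0=NU0, l_decay=L_DECAY):
    """Equation 1: F = 2 l ln(q_f/q_i) [piB(nu0, T_em) - piB(nu0, T_strat)]."""
    return 2.0 * l_decay * math.log(q_f / q_i) * (
        pi_planck(nu0, t_em) - pi_planck(nu0, t_strat)
    )


# Illustrative (made-up) emission temperatures in kelvin.
f_4x = co2_irf(1.0, 4.0, t_em=260.0, t_strat=227.0)
f_2x = co2_irf(1.0, 2.0, t_em=260.0, t_strat=227.0)
f_4x_cooler = co2_irf(1.0, 4.0, t_em=260.0, t_strat=222.0)
print(f_4x, f_2x, f_4x_cooler)
```

Under these assumptions the quadrupling value is twice the doubling value for a fixed state (because ln 4 = 2 ln 2), and lowering the stratospheric emission temperature raises F, which is the sense of the state dependence discussed in the text.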

As CO2 increases in the 1pctCO2 simulations, 
the surface temperature warms and the stratosphere 
cools roughly linearly over time (Fig. 2, 
B and C). To assess the relative contributions 
of these changes in climate to the increase in 
IRF4×CO2, we include results from the CMIP6 
abrupt-4xCO2 experiment (Fig. 2, dashed lines; 
only one model provided online double-call 
calculations for this experiment). In contrast 
to the 1pctCO2 experiment, CO2 is instantly 
quadrupled in the abrupt-4xCO2 experiment, 
causing the surface to warm rapidly over the 
first few decades before leveling off. The stratosphere 
adjusts even more rapidly, equilibrating 
to a new temperature within the first year. 

The contrasting temporal evolution of the 
climate between these two scenarios is reflected 
in the IRF4×CO2. For instance, the IRF4×CO2 
with abrupt-4xCO2 base state exhibits only a 
mild increase with global mean surface warming 
(Fig. 2), indicating a relatively weak dependence 
of the CO2 IRF on surface temperature. 
By contrast, IRF4×CO2 in the 1pctCO2 experiment 
exhibits a much larger increase over 
time despite having a similar change in global 
mean surface temperature. Physically, the CO2 
IRF represents a swap of tropospheric emission 
for stratospheric emission (6), and because 
the temperature change within the stratosphere 
is much larger than that at the surface 
and within the troposphere, the IRF increase 
closely follows the stratosphere cooling, suggesting 
a dominant role of stratospheric temperature 
on the CO2 IRF. We emphasize that 
the results shown in Fig. 2A represent IRF 
only and do not include the stratospheric adjustment. 
Rather, the changes in IRF over time 
reflect the impact of the stratospheric adjustment 
from prior CO2 changes on the base 
state, which in turn amplifies the IRF that would 
result from a subsequent “hypothetical” quadrupling 
of CO2. Because cloud masking has 
virtually no influence on stratospheric emission, 

Fig. 2. CO2 IRF increases as the surface warms and the stratosphere cools. (A to C) Time series of 
global and annual mean online double-call IRF4×CO2 (A), surface temperature (B), and air temperature 
at 10 hPa (C) from models CNRM-CM6-1 and IPSL-CM6A-LR. Three highlighted time slices in (A) are years 1 
to 10, 66 to 75, and 131 to 140. Overlaid gray triangles represent the global and time mean SOCRATES 
offline double-call IRF4×CO2 with corresponding atmospheric profiles of 1pctCO2 simulations from CMIP5/6 
models. The black plus symbols show the global mean ARTS offline double-call IRF4×CO2 with time mean 
atmospheric profiles from the CMIP6 model, which has the median SOCRATES double-call IRF4×CO2 value. 
Similar results from another line-by-line model (PyRADS) are shown in fig. S1. Note that the results in 
(A) represent IRF only and do not include any rapid adjustment. Rather, the changes in IRF over time reflect 
the impact of the effects from prior CO2 changes on the base state, which in turn amplifies the IRF that 
would result from a subsequent “hypothetical” quadrupling of CO2. 


the dominant role of stratospheric temper- 
ature also remains under all-sky conditions. 
The state dependence of CO2 IRF on the 
surface temperature and stratospheric temperature 
is also evident in the amip simulations 
(Fig. 1A). Because these simulations adopt 
the same sea surface temperature as their 
boundary conditions, our results imply that 
differences in stratospheric temperature are 
primarily responsible for the intermodel spread 
in IRF4×CO2. To confirm the role of the stratospheric 
temperature on the IRF spread, we 
also performed the SOCRATES offline double-call 
IRF calculations using the same amip 
base states and determined their correlation with 
the corresponding air temperature at 10 hPa, 
which is the highest level of CMIP5 standard 
pressure-level outputs [and is closest to the 
level with unit optical depth achieved by the 
peak of the CO2 absorption band (16, 20, 28)]. 
A high, significant correlation was found between 
the IRF and stratospheric temperature 
across both the CMIP6 and CMIP5 models 
(Fig. 1C and fig. S2), highlighting that biases 
in stratospheric temperature play a dominant 
role in causing the intermodel spread 
in CO2 IRF. 

The overwhelming effect of stratospheric 
temperature over surface temperature is also 
reflected in the brief declines for many models 
in the magnitude of the IRF4×CO2 in the year 
1992, after the eruption of Mount Pinatubo 
(Fig. 1A). On average and across the models, 
there was only a 0.2 K surface temperature 
decrease but an ~1 K temperature increase at 
10 hPa in 1992 compared with 1991. 

The analytical model of CO2 IRF by 
Jeevanjee et al. (16) replicates the offline double-call 
IRF4×CO2 of CMIP6 and CMIP5 with high 
correlations for abrupt-4xCO2 simulations (Fig. 
3A and fig. S3), providing a computationally efficient 
alternative for investigating the sensitivity 
of the CO2 IRF to stratospheric temperatures. 
Because the 10 hPa temperatures cool at a 
similar rate for all models under 1pctCO2 scenarios 
from CMIP6 and CMIP5 (Fig. 3B and 
fig. S4), the temperatures at this level have 
nearly identical intermodel spread at the beginning 
and end of the simulations. This suggests 
that intermodel spread in the CO2 IRF arises 
explicitly from differences in the initial stratospheric 
temperatures under preindustrial conditions. 
We confirmed this with the analytical 
model, finding that it produces the same IRF 
intermodel spread and is highly correlated with 
the offline double-call calculations even when 
the initial, preindustrial upper stratospheric 
temperatures are used as input for every time 
step instead of the actual, time-varying temperature 
from the corresponding abrupt-4xCO2 
simulations (Fig. 3C and fig. S5). 

Our results demonstrate that CO2 IRF increases 
as the climate changes in response to 
increased CO2. Online and offline double-call 
calculations from the CMIP6 historical simulations 
(Fig. 4A, fig. S6A, and table S3) indicate 
that IRF4×CO2 is ~10% larger today than it was 
in the mid-19th century due to the change in 
base state, which was primarily from stratospheric 
cooling. This amplification arises predominantly 
from the increase in well-mixed 
greenhouse gases over this period (Fig. 4A). 
Thus, the IRF4×CO2 increases over time because 
the CO2-induced cooling of the stratosphere 
makes any subsequent change in CO2 
more potent. 

Because it is the sum of the IRF and rapid 
adjustments, known as the total or “effective” 
RF, that ultimately drives climate change 
(1, 3, 4, 29), it is important to understand the 
extent to which the rapid adjustments may 
also depend on the base state. To investigate 
the state dependence of the adjustments, we 
used atmosphere-only model simulations 
forced by boundary conditions of both the 
preindustrial era (piClim-control) and re- 
cent decades (amip), along with their corre- 
sponding 4xCO2 counterparts (piClim-4xCO2 
and amip-4xCO2; see the materials and meth- 
ods and table S4). The amip simulation not 
only has a higher prescribed CO2 concentration 
than that of the piClim-control simulation, 
but also has a cooler stratosphere temper- 
ature, allowing us to quantify the magnitude 
of the adjustments under two different base 
states. 

The stratospheric adjustment is the most 
important of the rapid adjustments to CO2 
forcing, typically an order of magnitude larger 
than tropospheric adjustments (11, 12). The sum 

Fig. 3. Differences in initial stratospheric temperatures across models explain approximately half of 
the intermodel spread in IRF4×CO2, as shown using abrupt-4xCO2 experiments. (A) Comparison of 
global and time mean IRF4×CO2 in years 121 to 140 from the offline double-call and analytical model 
calculations with base state from abrupt-4xCO2 experiments. The correlation between global and time mean 
IRF4×CO2 in every 10 years of the 150-year experiments from the offline double-call and the analytical model 
calculations ranges from 0.88 to 0.89. (B) Time series of global and annual mean 10 hPa air temperature 
under the 1pctCO2 scenario from CMIP6 models. Each gray line in (B) represents the 10 hPa temperature 
evolution of a model, and the thick black line shows the multimodel ensemble mean. The curly bracket in 
(B) highlights the correlation between 10 hPa air temperature at years 1 and 140. (C) Comparison of the global and 
time mean original analytical IRF4×CO2 in years 2 to 11 with that obtained with perturbed stratospheric emission 
temperature from piControl runs (piCTL-Tstrat). The correlation between the global and time mean IRF4×CO2 from 
the original and piCTL-Tstrat perturbed calculations ranges from 0.90 to 0.92. 


of IRF and stratospheric adjustment, or the 
“stratospheric adjusted” RF, is roughly equal 
at the tropopause and the TOA (30), thus providing 
an accurate and computationally efficient 
analog for the total RF. Figure S6 compares 
the IRF, stratospheric adjustments, and stratospheric 
adjusted RF from the CO2 quadrupling 
for the two different base states (see the materials 
and methods). The amip simulations 
exhibit a larger IRF (fig. S6A; 0.38 W m⁻²) 
compared with that obtained under preindustrial 
conditions because of the cooler stratosphere. 
There is a nearly identical difference 
in the stratospheric adjusted RF between the 
two sets of experiments (fig. S6C; 0.34 W m⁻²) 
because almost no difference is seen in the 
stratospheric adjustments (fig. S6B; -0.03 W m⁻²). 
Note that the abovementioned ensemble mean 
forcing differences are also corroborated by 
differences shown for individual models. Even 
though the direct contribution of the base 
state to the intermodel spread in stratospheric 
adjusted RF and total RF is smaller than it is 
for the IRF, because additional sources of 
spread contribute, there are high, significant 
correlations between the IRF and both the 
stratospheric-adjusted RF and total RF (fig. S7, A 
and B). 

The state dependence of both the IRF and 
stratospheric adjustment was further explored 
using the more realistic, online, interactive, 
coupled simulations forced by abruptly halving, 
doubling, and quadrupling CO2 concentration 
of the preindustrial era (abrupt-0.5xCO2, 
abrupt-2xCO2, and abrupt-4xCO2, respectively; 
see the materials and methods and table S5). 
As expected, for every model analyzed, we 
found that the IRF grew in magnitude across 
the three sets of experiments for each successive 
CO2 doubling (fig. S8). The stratospheric-adjusted 
RF exhibited a nearly identical 
increase across the experiments, with the stratospheric 
adjustment only weakly offsetting the 
increases. Similar increases per CO2 doubling 
have also been found for the total RF estimated 
from atmosphere-only simulations with 
fixed sea surface temperatures (31). This indicates 
that with almost no counteracting 
effects from rapid adjustments, the radiative 
effects from the stratospheric temperature 
base-state dependence of the IRF extend to the 
total RF (figs. S6 to S8) and thus on to climate 
sensitivity. 

Changes in climate sensitivity can therefore 
arise from both changes in climate feedback 
and changes in IRF. More generally, these results 
indicate that despite the logarithmic dependence 
of CO2 absorption (28), the climate 
becomes increasingly sensitive to a doubling 
of CO2 as the base-state CO2 concentration 
increases and the stratosphere cools correspondingly. 
The IRF2×CO2 increases by ~25% 
for each doubling of base-state CO2 concentration 
(i.e., the IRF2×CO2 increases by 24 and 
29% for the first and second doubling of base-state 
CO2 concentration, respectively; Fig. 2A). 
Because the IRF accounts for about two-thirds 
of the total RF from CO2 (1, 10-12), this implies 
that ΔT2×CO2 increases by ~15 to 20% for each 
doubling of CO2 just from changes in the IRF. 
This state dependence of the IRF2×CO2, and 
thus ΔT2×CO2, has not been accounted for in 
the latest IPCC reports. 
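
The step from a ~25% increase in IRF to a ~15 to 20% increase in climate sensitivity can be reconstructed as follows. This is a back-of-the-envelope sketch, assuming that the rapid adjustments and the radiative damping λ stay fixed while only the IRF changes; the primed symbols, denoting values after one doubling of base-state CO2, are introduced here for illustration and do not appear in the paper.

```latex
\begin{aligned}
\mathrm{RF}_{2\times\mathrm{CO_2}} &= \mathrm{IRF}_{2\times\mathrm{CO_2}} + \text{adjustments},
\qquad \mathrm{IRF}_{2\times\mathrm{CO_2}} \approx \tfrac{2}{3}\,\mathrm{RF}_{2\times\mathrm{CO_2}} \\
\mathrm{RF}'_{2\times\mathrm{CO_2}} &\approx \mathrm{RF}_{2\times\mathrm{CO_2}} + 0.25\,\mathrm{IRF}_{2\times\mathrm{CO_2}}
\approx \left(1 + 0.25 \times \tfrac{2}{3}\right)\mathrm{RF}_{2\times\mathrm{CO_2}}
\approx 1.17\,\mathrm{RF}_{2\times\mathrm{CO_2}} \\
\frac{\Delta T'_{2\times\mathrm{CO_2}}}{\Delta T_{2\times\mathrm{CO_2}}}
&= \frac{\mathrm{RF}'_{2\times\mathrm{CO_2}}}{\mathrm{RF}_{2\times\mathrm{CO_2}}} \approx 1.17
\qquad \text{(fixed } \lambda \text{)}
\end{aligned}
```

Using the 24% and 29% increases quoted for the first and second doubling in place of 25% gives roughly 16% and 19%, consistent with the stated ~15 to 20% range.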


Potential climate implications 


Because the upper stratospheric temperature 
plays a dominant role in determining the magnitude 
of the CO2 IRF, any changes in atmospheric 
composition that perturb stratospheric 
temperature could subsequently affect the cli- 
mate. Consider the recent example of polar 
ozone depletion (32-34), which strongly influ- 
ences the temperature structure within the 
stratosphere (35). The ozone depletion since 
the 1970s has led to strong cooling within the 
stratosphere. By cooling the stratosphere, ozone 
depletion makes the forcing from the increase 
in CO2 over this period more potent. Although 
the stratospheric ozone loss mainly occurs in 
the lower stratosphere (36, 37), the associated 
cooling also contributes to a decline in infrared 
emission from the lower to the upper stratosphere, 
thus strengthening the CO2 IRF at 
the TOA. 

Here, we examined this nonlinear interaction 
between ozone depletion-induced cooling 
and CO2 IRF by comparing a 10-member ensemble 
of model simulations that use all historical 
forcings with the corresponding sum of 
model simulations in which each historical 

Fig. 4. Any forcing that perturbs the stratospheric temperature can further affect the climate by 
modulating the radiative forcing by CO2. (A) Time series of three available online double-call IRF4×CO2 
from CMIP6 historical simulations and the multimodel ensemble mean of corresponding offline double-call 
IRF4×CO2 for CMIP6 models with both historical and historical well-mixed greenhouse gas-only (hist-GHG) 
simulations. (B) Ensemble mean map of the indirect surface warming effect of ozone depletion during the 
period 1985 to 2014. 


forcing is imposed independently (see the mate- 
rials and methods). According to our theory, 
model simulations in which ozone loss and 
CO2 increase coincide should have a larger 
CO2 forcing (and thus greater surface warming) 
than the sum of individual model simulations, 
in which each forcing is imposed 
separately in isolation from the other. The CO2 
forcing in the latter is smaller because it is not 
enhanced by ozone depletion-induced cool- 
ing. We computed the indirect surface warm- 
ing effect of ozone depletion by taking the 
ensemble mean difference in surface tempera- 
ture anomalies between these two sets of ex- 
periments averaged over the period 1985 to 
2014 (see the materials and methods). 
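
The difference described above can be written compactly. The notation is introduced here for illustration and is not taken from the paper: the superscript "all" marks the simulations with all historical forcings, the index k runs over the simulations in which a single forcing is imposed, ΔT_s is the surface temperature anomaly, and the overbar denotes the ensemble mean averaged over 1985 to 2014.

```latex
\Delta T_s^{\mathrm{indirect}}
  = \overline{\Delta T_s^{\mathrm{all}}}
  - \sum_{k} \overline{\Delta T_s^{(k)}}
```

A positive value means the combined forcings warm the surface more than the sum of their separate effects, which is the sign expected if ozone depletion-induced stratospheric cooling strengthens the CO2 forcing.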

As predicted, the sign and spatial distribution 
of the nonlinear contribution of ozone 
loss to CO2 IRF are consistent with a base-state 
dependence of IRF (Fig. 4B). Most of the indirect 
surface warming effect occurs around 
the poles, where the local stratosphere has the 
strongest cooling, although some heat transport 
may also be playing a role (38, 39). The 
smaller warming over the southern high latitudes 
likely reflects the greater rate of ocean 
heat uptake by the Southern Ocean (40, 41). 
This supports the premise that any forcing 
agent changes that perturb the stratospheric 
temperature could also affect the climate by 
modulating the CO2 IRF at the TOA even without 
changing the CO2 amount. 

Our findings may also help us better understand 
past climate events such as the end-Devonian 
mass extinction and the Paleoproterozoic 
“snowball Earth” conditions, which occurred 
after similar but considerably stronger pertur- 
bations [i.e., a substantial drop in stratosphere 
ozone (42) and the inevitable development of 
an ozone layer (43, 44), respectively]. The base-state 
dependence of the CO2 IRF may have 
implications for how other related metrics are 
defined, such as global warming potential and 
efficacy of non-CO2 forcing (9, 29), because 
they are quantified relative to the radiative 
effects of a CO2 perturbation. These metrics 
are often used in policy discussions, so it will be 
particularly important to determine whether 
they must be redefined with consideration 
of the dynamic (i.e., nonconstant) behavior of 
CO2 IRF. 

Additionally, our results may have implica- 
tions for geoengineering and climate change 
mitigation (45). Taking 1992, the year after the 
1991 eruption of Mount Pinatubo, as an exam- 
ple, the injected volcanic aerosols within the 
stratosphere not only cooled the surface by 
reflecting more solar radiation back to 
space, but they also warmed the stratosphere 
by increasing the atmospheric absorption of 
sunlight in the stratosphere (46, 47). The resulting 
stratospheric warming weakened the 
CO2 IRF (Figs. 1A and 4A) and reduced the 
warming efficacy of CO2. Because most geoengineering 
approaches involving stratospheric 
aerosol injection use reflective aerosols [e.g., 
sulfate (48)], alternative approaches that use 
more absorbing aerosols (e.g., black carbon) 
may warrant consideration, because this could 
effectively reduce the CO2 greenhouse effect 
by warming the upper stratosphere (fig. S9) 
(49, 50). 

Finally, we note that the model simulations 
of stratospheric temperature can be easily con- 
strained with observations. Across multiple 
sets of observations and reanalyses (see the 
materials and methods and table S6), the glo- 
bal and annual mean 10 hPa air temper- 
ature had an uncertainty range of 226.6 to 
228.4 K in the year 2020. This ~1.8 K difference 
in base state would translate to only an 
~0.16 (0.18) W m⁻² IRF4×CO2 uncertainty for 
CMIP6 (CMIP5) models (Fig. 1C and fig. S2). 
This highlights the importance of accurately 
representing the stratosphere when projecting 
future CO2-induced climate change and 
the potential to better constrain model projections 
using observations, further emphasizing 
the importance of continuing observations 
in Earth’s middle and upper atmosphere (51). 

Boryl radical catalysis enables asymmetric radical cycloisomerization reactions 


Chang-Ling Wang, Jie Wang, Ji-Kang Jin, Bin Li, Yee Lin Phang, Feng-Lian Zhang, Tian Ye,
Hui-Min Xia, Li-Wen Hui, Ji-Hu Su, Yao Fu, Yi-Feng Wang


The development of functionally distinct catalysts for enantioselective synthesis is a prominent yet 
challenging goal of synthetic chemistry. In this work, we report a family of chiral N-heterocyclic carbene 
(NHC)-ligated boryl radicals as catalysts that enable catalytic asymmetric radical cycloisomerization 
reactions. The radical catalysts can be generated from easily prepared NHC-borane complexes,
and the broad availability of the chiral NHC component provides substantial benefits for stereochemical
control. Mechanistic studies support a catalytic cycle comprising a sequence of boryl radical addition, 
hydrogen atom transfer, cyclization, and elimination of the boryl radical catalyst, wherein the chiral NHC 
subunit determines the enantioselectivity of the radical cyclization. This catalysis allows asymmetric 
construction of valuable chiral heterocyclic products from simple starting materials. 


The design and discovery of enantioselective
catalysts has been a persistent goal
of organic chemistry, given the increasing
demands for enantiomerically pure
chiral products as pharmaceuticals and
functional materials. Over the past decades, 
great progress has been achieved in the devel- 
opment of catalytic enantioselective reactions 
involving radical species as key intermediates 
(1-4), in which a variety of elegant catalytic 
approaches—including the use of Lewis acids 
(5-7), organocatalysts (8, 9), transition metals 
(10-14), enzyme catalysts (15, 16), and photo- 
catalysts (17-20)—have been reported. Despite 
these exciting advances, the exploitation of 
structurally and functionally distinct chiral 
catalysts and mechanistically distinct cata- 
lytic reactions is highly desirable and yet chal- 
lenging to accomplish. 

Free organic radicals are commonly involved 
as stoichiometric promoters or intermediates 
in a variety of reactions (21). By contrast, the
use of free organic radicals as the catalysts 
remains limited (22) because the short life- 
time and the high reactivity of these species 
pose a marked challenge in the realization of 
efficient catalytic cycles. However, their specific 
chemical properties can promote notable mo- 
lecular transformations that are not easily 
accessible by other known catalytic methods. 
In this regard, a set of radical species, including
thiyl radicals (23-25), stannyl radicals (26, 27), 
nitrogen radicals (28, 29), bromine radicals 
(30, 31), and boron-centered radicals (32-34), 
have been reported as effective catalysts for cy- 
clization reactions. Nevertheless, asymmetric 
reactions using chiral radical catalysts remain 
a daunting challenge (35-37). In this context, 
as depicted in Fig. 1A, the radical catalyst un- 
dergoes addition to a prochiral substrate to 
form a covalent bond, and the resulting rad- 
ical intermediate participates in enantiose- 
lective radical transformations, wherein the 
bound chiral unit determines the enantiose- 
lectivity. Eventually, an elimination reaction 
occurs to break the formed covalent bond, 
furnishing a chiral product and regenerating 
the radical catalyst. The major difficulty for 
such asymmetric radical catalysis lies in the 
lack of competent chiral catalysts, on which 
the chiral unit should be broadly available 
and easy to install and, more importantly, 
should cooperate with the radical center to 
achieve both high catalytic efficiency and 
enantioselectivity. 

Accordingly, only a very limited number of 
chiral radical catalysts and associated enantio- 
selective reactions have thus far been reported 
(Fig. 1B, left). For example, the Zhang group 
reported a series of cobalt-based metallorad- 
ical catalysts in which the chiral porphyrin 
ligands played a dual role in controlling both 
the reactivity and the stereoselectivity (35). 
These cobalt catalysts were specifically effec- 
tive in activating diazo and azide compounds 
and then triggering enantioselective catalytic 
radical transformations (38, 39). In 2014, the 
Maruoka group designed an organic thiyl rad- 
ical that could catalyze an enantioselective 
cycloaddition of vinylcyclopropanes and al- 
kenes (36), wherein strain-promoted ring 
opening served as an important driving force 
to favor the thiyl radical addition (23, 24). Very 
recently, Miller and colleagues reported a similar
enantioselective catalytic reaction using a
peptide-based thiyl radical (40). Despite these
exciting findings, the inherent reactivity of
these radical catalysts limits their applicability 
to other types of catalytic reactions. Moreover, 
the types of competent chiral components are 
limited, and the preparation of the chiral pre- 
catalysts usually requires a lengthy series of 
steps, thereby hindering their applications. To 
achieve a conceptually distinct and generally 
applicable asymmetric radical catalysis, there 
is a compelling need to find a group of ele- 
ment radical catalysts that not only display 
specific chemical reactivity to enable mech- 
anistically distinct catalytic cycles but can 
also be readily connected with a chiral unit 
that is widely available and easily allows struc- 
tural diversification. 

N-heterocyclic carbene (NHC)-boryl radicals,
a class of boron-centered radicals ligated with an
NHC component, have shown specific chemical
reactivity and have recently found application
in chemical synthesis (41, 42) since the pioneering
work by the groups of Fensterbank,
Lacôte, Malacria, and Curran (43). These radicals
can be easily generated from readily available
NHC-BH₃ complexes through hydrogen
atom abstraction with the aid of a radical
initiator (43). The Walton, Lacôte, and Curran
groups (44) and our group (45, 46) discovered
that the addition of NHC-boryl radicals to alkenes
was a reversible process, which suggests
that these radicals may be used as competent
catalysts in radical addition-elimination processes.
On this basis, we reported an NHC-boryl
radical-catalyzed cycloisomerization reaction
of N-(2-ethynylaryl)arylamides (32). Encouraged
by this finding, we perceived that the use of a
chiral NHC-ligated boryl radical as the catalyst
would be able to achieve extraordinary enantioselective
catalytic radical reactions (Fig. 1B,
right). It is known that a wide variety of chiral
NHC precursors are either commercially available
or can be prepared easily, and a simple
treatment with BH₃·THF in the presence of a
base can provide a vast array of bench-stable
chiral NHC-BH₃ complexes (43). This advantage
makes it easy to obtain a large library of NHC-boryl
radicals tethered with various structurally
tunable chiral components, which is highly valuable
for reaction screening and development.

In this work, we report the development 
of a general and efficient mode of asymmetric 
NHC-boryl radical catalysis, which enables 
asymmetric radical cycloisomerization reactions 
for the rapid assembly of a range of enantio- 
enriched five- and six-membered heterocycles 
(Fig. 1C). The catalytic cycle proceeds through 
a sequence of boryl radical addition to alkynes, 
hydrogen atom transfer (HAT), cyclization, and 
elimination of the boryl radical catalyst, dur- 
ing which the chiral NHC unit creates a chiral 
microenvironment that can exert effective 
stereochemical control in the C-C bond-forming 
cyclization step. The reaction mechanism and
the origin of enantioselectivity have been elucidated
by experimental and computational
studies. The obtained heterocycles containing
an α-stereocenter resemble chiral fragments
of pharmaceuticals or targeted compounds
in medicinal chemistry studies (Fig. 1D) (47-49),
but their enantioselective synthesis by asymmetric
catalysis from simple starting materials
remains a challenge.

Fig. 1. Radical catalysts for catalytic enantioselective reactions.
(A) The catalytic cycle for asymmetric radical catalysis. (B) Reported
radical catalysts together with their catalyzed enantioselective reactions
(left) and the unexplored chiral NHC-boryl radical catalysts (right).
(C) The design of two catalytic cycles enabled by chiral NHC-boryl
radical catalysts. (D) Chiral pyrrolidines and pyrrolo[3,2,1-ij]quinolines
containing an α-stereocenter are important motifs in medicinally
relevant molecules.

Fig. 2. Reaction development. (A) Boryl radical-catalyzed asymmetric cycloisomerization of N-benzyl homopropargyl amine 1a. (B) Boryl radical-catalyzed
asymmetric cycloisomerization of N-benzyl-indol-7-yl propiolate 3a. (C) Screening of chiral NHC-BH₃ precatalysts.


Reaction design 


During the course of our study on NHC-boryl 
radical-triggered cascade cyclization reactions 
(41), we hypothesized that if a cascade could 
generate a β-boryl alkyl radical intermediate,
β-elimination would take place to liberate the
boryl radical, thereby completing a radical cat- 
alytic cycle. Furthermore, our previous work 
had revealed that the regioselectivity of the 
boryl radical addition to alkynes was tunable 
by varying the alkynyl substituents (45, 50). 
This feature would allow divergent catalytic cy- 
cles that might yield a diverse family of cyclic 
molecules. On the basis of these considerations, 
we posited two boryl radical-catalyzed cycloisom- 
erization reactions that proceed through dis- 
tinct cycles (Fig. 1C). In cycle A, an NHC-boryl 
radical attacks C, of 1 when R¹ has more radical
stabilizing capacity, giving vinyl radical I.
After that, a 1,6-hydrogen atom shift occurs to 
give alkyl radical II, which then undergoes 
5-exo cyclization to afford radical III with the 
construction of the five-membered ring skeleton 
(51, 52). The chiral unit on the boron biases the 
trajectory of the radical approaching the boron- 
substituted alkenyl carbon atom, thus realizing 
enantiocontrol. Finally, the NHC-boryl radical 
catalyst is eliminated from III to complete the 
catalytic cycle with the formation of product 2. 
When an aryl ring is attached at C,, the NHC- 
boryl radical catalyst undergoes addition to C; 
of 3 instead (50), and the subsequent 1,5-H 
shift-cyclization-elimination sequence completes 
the catalytic cycle, furnishing product 4. Again, 
the enantioselectivity is determined by the chiral 
NHC component during the cyclization step. 


Reaction development 


To verify the feasibility of our proposed two
catalytic cycles, 1a and 3a were chosen as the
model substrates for reaction development. First,
the catalytic reactions were proven effective using
a simple achiral NHC-BH₃ (1,3-dimethylimidazol-2-ylidene
borane) as the precatalyst, affording
3-methylenepyrrolidine 2a and pyrrolo[3,2,1-ij]quinoline
4a in 96% and 85% yields, respectively
(table S1, entry 1, and table S2, entry 1).
Control experiments showed that no reaction
occurred in the absence of a radical initiator
(table S1, entry 2, and table S2, entry 2), which
confirms the radical reaction mechanism of
both transformations. Encouraged by this finding,
we next investigated enantioselective boryl
radical-catalyzed transformations using chiral
NHC-BH₃ as the precatalyst (Fig. 2). Extensive
screening revealed that the reaction of 1a
with B-1 (20 mol %) provided 2a in 83% yield
with high levels of both asymmetric induction
[96:4 enantiomeric ratio (er)] and E/Z selectivity
(>95:5). Reducing the catalyst loading
to 10 mol % lowered the product yield
to 46%, with 45% of 1a recovered,
mainly because the short lifetime and high
reactivity of this boryl radical species may
cause its fast consumption through other pathways.
Changing the t-Bu group in B-1 to a Bn
(B-2, 73:27 er) or i-Pr (B-9, 59.5:40.5 er; fig. S1) group
led to a marked decrease in enantioselectivity,
which indicates an important steric shielding
effect of this t-Bu group in the enantioselective
cyclization step. Other types of chiral NHC-BH₃
complexes (B-3 to B-8) gave inferior yields and
stereoselectivities. For the reaction of 3a, B-5
was found to be the optimal precatalyst, affording
product 4a in 94% yield and 93.5:6.5 er.
Replacing the N-aryl group with a
t-Bu group maintained good er (B-6, 89.5:10.5),
although the product yield dropped. The
reactions with other types of chiral NHC skeletons
(B-1 to B-4 and B-7 to B-8) were found
to be much less effective. Notably, gram-scale
syntheses of 2a and 4a were also achieved in
good yields while maintaining excellent enantioselectivities.
The use of ent-B-1 and ent-B-5
as the catalysts reversed the stereochemistry,
giving 2a and 4a in 4.5:95.5 er and 7:93 er,
respectively (figs. S1 and S2). The results of
screening other chiral NHC-BH₃ precatalysts
are provided in the supplementary materials
(figs. S1 and S2).

Fig. 3. Substrate scope

Fig. 4. Substrate scope of NHC-boryl radical-catalyzed asymmetric cycloisomerization of N-benzyl-indol-7-yl propiolates 3. Reaction conditions: substrates 3
(0.1 to 0.3 mmol), B-5 (20 mol %), ABVN (70 mol %), isopropyl ether (0.1 M), 60°C, 24 hours, under nitrogen. Isolated yields after chromatography are given.
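
The enantiomeric ratios (er) quoted in this section can be restated as enantiomeric excess (ee), which some readers may find easier to compare across reports. The Python sketch below applies the standard definition, ee = (major - minor) / (major + minor); the function name is hypothetical and the inputs are simply the er values given in the text.

```python
def er_to_ee(major: float, minor: float) -> float:
    """Convert an enantiomeric ratio major:minor into percent enantiomeric excess."""
    total = major + minor
    if total <= 0:
        raise ValueError("the two enantiomer fractions must sum to a positive number")
    return 100.0 * (major - minor) / total


# er values quoted in the text for the two model reactions.
ee_2a = er_to_ee(96.0, 4.0)    # product 2a with precatalyst B-1
ee_4a = er_to_ee(93.5, 6.5)    # product 4a with precatalyst B-5

print(f"2a: {ee_2a:.1f}% ee, 4a: {ee_4a:.1f}% ee")
```

On this scale, the 96:4 er obtained with B-1 corresponds to 92% ee and the 93.5:6.5 er obtained with B-5 corresponds to 87% ee.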


Substrate scope for two catalytic asymmetric 
radical cycloisomerization reactions 


The NHC-boryl radical-catalyzed enantioselective
isomerization reactions through both
catalytic cycles A and B showed a broad substrate
scope and good to excellent levels of enantiocontrol.
Using B-1 as the precatalyst and azobiscyclohexanecarbonitrile
(ACCN) as the radical
initiator, a broad range of N-homopropargyl
amines 1 were stereoselectively converted to the
corresponding 2-aryl-3-benzylidenepyrrolidines
2 (Fig. 3). The reaction worked well when the
R¹ substituent was an aryl ring. Aryl rings bearing
an electron-withdrawing group (2a to 2c)
generally provided higher product yields compared
with a ring substituted with an electron-donating
group (2e), likely because the electron
deficiency promotes nucleophilic boryl radical
addition. Changing the aryl group to an ester
(1f) or an amide (1g) moiety was also compatible
with the cycloisomerization, furnishing
the corresponding products 2f and 2g in
good yields and excellent er, although a lower
E/Z selectivity was obtained. An aryl ring as
the R² substituent was found to be important
to trigger the catalytic cycle because aryl stabilization
is the main factor enabling intramolecular
HAT and the subsequent cyclization
step. A broad array of (hetero)aryl rings, including
naphthalene (2j), carbazole (2k), indole
(2l), benzofuran (2m), pyridine (2n), thiophene
(2o), and furan (2p), could be installed successfully
in synthetically useful yields and excellent
er. No detrimental effect was detected when other
functional groups, such as t-butyloxycarbonyl
(2q), benzoyl (2r), and 5-hydroxymethyl-2-furoyl
(2s), were incorporated at the nitrogen atom.
The absolute configuration of 2r was confirmed
by x-ray crystallographic analysis. Furthermore,
the present catalytic method allowed for the
construction of a 3-benzylidenetetrahydrofuran
framework (2t and 2u), albeit with a slight
decrease of er. In place of an aryl group, a
tert-butoxycarbonyl (2v and 2w) or N,N-dimethylaminocarbonyl
(2x) group could be used as
substituent R², affording the corresponding
pyrrolidines in moderate to good yields and
er. Compound 2v was obtained using ent-B-1
as the catalyst.

The catalytic cycloisomerization of 1-benzyl-7-alkynyl-indoles
3 proceeded smoothly with
20 mol % of B-5 in the presence of 2,2'-azobis(2,4-dimethyl)valeronitrile
(ABVN) as the radical
initiator, affording a broad range of pyrrolo[3,2,1-ij]quinolines
4 in good yields and enantioselectivities
(Fig. 4). An aryl group as R* was again
important to enable intramolecular HAT. A
series of substituents at different positions on
the aryl ring (4b to 4f) were compatible, and a
thiophene moiety could be installed as well
(4g). A wide variety of substituted indoles (R°
and R°) underwent this catalytic transformation
stereoselectively to afford the corresponding
products in good yields. For example, a large
number of 2-alkyl (4h and 4i), 2-cycloalkyl (4j
and 4k), 2-allyl (4l), and 2-aryl (4m to 4o) substituted
pyrrolo[3,2,1-ij]quinolines were successfully
assembled with high yields and good
er. The absolute configuration of 4o was also
established by x-ray crystallography. Other types
of tetracyclic products 4p and 4q were formed
in good yields without any loss of enantioselectivity.
Carbazole 3r was also a viable substrate,
delivering the desired product 4r in
88% yield with 87.5:12.5 er. When the R° substituent
on the indole ring was a hydrogen atom,
the reaction afforded product 4s in good yield
and er. Further studies on the effect of the R?
group revealed that a methoxycarbonyl group
was crucial to enable efficient boryl radical addition,
and the change to an electron-deficient
aryl group (4t) or a CF₃ group (4u) led to no
formation of the product. We speculated that,
in these cases, the boryl radical addition to the
internal alkyne moiety might take place with
much lower regioselectivity, resulting in rapid
consumption of the boryl radical catalyst. As
a result, no cycloisomerization product was
detected.


Synthetic applications and a catalytic 
intermolecular reaction 


To illustrate the synthetic utilities of these two 
catalytic cycloisomerizations, the two obtained 
types of cyclic products were readily converted 
to a variety of versatile building blocks (Fig. 5). 
For example, hydroboration of the alkene motif
of 2a with BH₃·THF followed by deboration
or oxidation led to hydrogenation product
5 or alcohol 6, respectively, in good yields with
the maintenance of excellent enantioselectivity, 
and only single diastereomers were detected 
in both reactions. The absolute configuration 
of 6 was ascertained by x-ray crystallography. 
Notably, a simple treatment of alcohol 6 with 
CF3SO3H triggered a stereoselective intramolec- 
ular cyclization to produce fused cyclic product 
7 in 87% yield with >95:5 diastereomeric ratio 
(dr). Moreover, the reaction of 2r with bromine 
and the subsequent base-promoted elimination 
afforded alkenyl bromide 8 in 74% yield with 
94.5:5.5 er. This bromide would serve as a ver- 
satile precursor to couple with a wide variety 
of functional groups with the aid of transition 
metal catalysis. In addition, the ester moiety of 
product 4a was easily reduced to alcohol 9 in 
good yield. Hydrogenation of the alkene moiety 
afforded a quinoline framework 10, which is 
a core motif of many biologically active mol- 
ecules (48). Subjection of 4a to our previously 
developed radical hydroboration protocol using 
1,3-dimethylimidazol-2-ylidene borane as the 
boryl radical precursor (53) formed α-borylated
ester 11 in good yield with exclusive regio- and
stereoselectivity; absolute configuration was
again confirmed by x-ray crystallography. This
compound could be transformed to 1,2-diol 12 
through the reduction of the ester unit fol- 
lowed by the oxidation of the boron moiety. 
Notably, alkyl pinacol boronic ester 13 was 
obtained when the resulting reduction interme- 
diate was treated with pinacol in the presence 
of HCl. A plausible reaction mechanism for 
this unexpected transformation is discussed 
in the supplementary materials (fig. S3). The 
boron moiety of compound 13 could serve as 
a versatile handle to access a large number of
functionalized quinoline derivatives, which 
could be of interest in medicinal chemistry 
studies (48). The stereochemistry was large- 
ly retained in all these transformations. 

This asymmetric radical catalysis protocol 
also provides highly efficient methods for the 
synthesis of key intermediates toward cyclothialidine
and N-methyl-D-aspartate (NMDA)
receptor antagonist Ro 67-8867 (Fig. 5C). For
example, ozonolysis of 2v followed by reduction
with NaBH₄ afforded compound 14, which
is an important intermediate for cyclothialidine 
(49), in 81% yield as a single diastereomer while 
maintaining good enantioselectivity. Hydro- 
genation of the alkene motif of 2w and the 
subsequent reduction with LiAlH₄ gave compound
15, which underwent ring expansion and
hydrogenolysis to deliver 3-hydroxy-piperidine 
16 with excellent diastereo- and enantioselec- 
tivity. An alkylation reaction of 16 following 
the known procedure could produce Ro 67- 
8867 (54). 

The asymmetric NHC-boryl radical cataly- 
sis was also applicable for an enantioselective 
intermolecular [2+2+2] cyclization reaction 
(55). As depicted in Fig. 5D, the catalytic cycle 
was initiated from the addition of the boryl 
radical catalyst to the triple bond of propiolate 
17, and the resulting alkenyl radical added to 
one equivalent of alkene 18 to give a new alkyl 
radical. The addition to a second molecule of 
alkene 18 followed by intramolecular cycliza- 
tion constructed a cyclohexane ring. Eventually, 
a β-elimination took place to afford product
19 with the regeneration of the boryl radical 
catalyst. Although moderate yield and enantio- 
selectivity of product 19 were obtained (fig. S4 
and table S3), this exemplified catalysis sug- 
gests a catalytic radical [2+2+2] method to as- 
semble chiral six-membered rings. 


Mechanistic considerations 


Because chiral NHC-BH₃ complexes have the
potential to act as HAT catalysts (56, 57),
in which the boryl radical abstracts one of the
benzylic hydrogen atoms and eventually returns
a hydrogen atom to the alkenyl radical
generated after the intramolecular cyclization,
the catalytic reactions of 1a-D₂ and 3d-D₂,
which both contain two benzylic deuterium
atoms, were examined. One of the deuterium
atoms was fully transferred to the alkenyl carbon
atom, and the D-labeled products were formed
in good yields with excellent er (figs. S5 and S6).
These results excluded a HAT catalysis pathway.
This method would be of great value in the
synthesis of enantioenriched D-incorporated
analogs that may be of interest in medicinal
chemistry studies. Furthermore, monodeuterated
1a-D₁ and 3d-D₁ were prepared as substrates
for the study of intramolecular kinetic
isotopic effect (KIE) of the two catalytic reactions
using the corresponding chiral precatalysts
(figs. S7 and S8). The intramolecular KIE
values were determined to be 5.3 and 3.5 for
1a-D₁ and 3d-D₁, respectively, which are in
good agreement with the proposed step of
C-H bond cleavage through intramolecular
HAT. Moreover, electron paramagnetic resonance
(EPR) spin-trapping experiments (58)
verified the generation of NHC-boryl radicals
from B-1 and B-5 under the in situ reaction
conditions (figs. S9 and S10).

Fig. 5. Synthetic applications and a catalytic
intermolecular reaction. (A) The conversion
of products 2 to other common and useful
functional groups. (B) The conversion of product
4a to other functional groups. Reaction conditions:
(a) BH₃·THF (2.5 equiv), THF, 0°C to room
temperature (rt), 4 hours, then NaOH (20% aq.),
0°C, 20 min; (b) BH₃·THF (2.5 equiv), THF, 0°C
to rt, then H₂O₂ (30% aq.), NaOAc (20% aq.),
0°C, 30 min; (c) CF₃SO₃H (10 equiv), CH₂Cl₂,
0°C, 24 hours; (d1) Br₂ (2 equiv), CCl₄, 0°C to rt,
30 min; (d2) DBU (3 equiv), CH₂Cl₂, rt, 12 hours;
(e) LiAlH₄ (3 equiv), THF, -78°C to 0°C, 12 hours;
(f) Pd/C (0.3 equiv), H₂, THF-EtOH (1:1), 50°C,
13 hours; (g) 1,3-dimethylimidazol-2-ylidene borane
(1.2 equiv), 4-CF₃C₆H₄SH (30 mol %), isopropyl
ether, 60°C, 24 hours; (h1) LiAlH₄ (3 equiv), ZnCl₂
(1 equiv), THF, 0°C to rt, 5 hours; (h2) H₂O₂
(30% aq.), NaOH (20% aq.), MeOH-CH₃CN (1:1);
(i1) LiAlH₄ (3 equiv), ZnCl₂ (1 equiv), THF, 0°C to
rt, 5 hours; (i2) HCl (4M in dioxane, 1 equiv),
pinacol (2.5 equiv), CH₃CN, 0°C, 12 hours. DBU,
1,8-diazabicyclo[5.4.0]-7-undecene. (C) Synthesis of
key intermediates for biologically active molecules.
Reaction conditions: (j1) O₃, CH₂Cl₂, -78°C, 5 min
then PPh₃ (2 equiv); (j2) NaBH₄ (3 equiv),
MeOH-CHCl, -78°C, 1 hour; (k1) H₂, Pd/C (10%),
EtOH, rt, 3 hours; (k2) LiAlH₄ (4 equiv), THF, rt,
1 hour; (l1) (CF₃CO)₂O (1.5 equiv), THF, 0°C,
1 hour; (l2) Et₃N (3 equiv), THF, 0°C to reflux,
15 hours; (l3) NaOH (2.5 M, 0.5 mL), rt, 2 hours;
(l4) H₂, Pd/C (10%), EtOH, 55°C, 3 hours. (D) An
NHC-boryl radical-catalyzed enantioselective
intermolecular cyclization reaction.

Next, density functional theory (DFT) calculations
were performed to gain more insight
into the reaction mechanism and the origins
of enantioselectivity. The energy profile of the
reaction of 1a is depicted in Fig. 6 (details are
provided in the supplementary materials). The
NHC-boryl radical derived from B-1 is predicted
to add to the alkynyl C, of 1a with an
activation free energy of 9.0 kcal/mol to give
aryl-stabilized alkenyl radical Int-I. The subsequent
1,6-H shift proceeds through TS-II with
an energy barrier of 14.1 kcal/mol. From Int-II,
the cyclization may occur to produce two sets of
diastereoisomeric intermediates, among which
trans (Int-III-trans-R and Int-III-trans-S)
diastereomers form with lower activation energies
compared with their cis counterparts, most
likely because of the lower steric repulsion between
the boron unit and the Ar¹ group. Careful
inspection of the structures of two trans cyclization
transition states elucidates the origins
of enantio-induction by the bulky t-Bu on the
oxazole ring. The enantioisomeric transition
state TS-III-trans-R is favored over TS-III-trans-S
by 3.7 kcal/mol, which correlates well
with the observed excellent enantioselectivity
(96:4 er). The major reason for the differentiation
is the highlighted steric hindrance between
one of the t-Bu hydrogen atoms and the
allylic hydrogen atom of the substrate. This is
consistent with experimental findings that the
change of this t-Bu group to i-Pr or Bn leads to
a precipitous drop of er. Eventually, the NHC-boryl
radical elimination takes place stereoselectively
via TS-IV-E to give the E isomer.
The transition state TS-IV-Z to form the Z
isomer has more steric congestion between
the two aryl rings (Ar¹ and Ar²), thus resulting
in an elevated activation free energy (ΔΔG‡ =
2.1 kcal/mol). Therefore, only 2a with E geometry
is obtained as a single stereoisomer.
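
To make the link between computed barrier differences and product ratios concrete, the following Python sketch applies the usual transition-state-theory relation for two competing pathways under kinetic control, ratio = exp(ΔΔG‡/RT). It is an illustrative calculation rather than part of the reported computational workflow. The temperature of 333.15 K (60°C) is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3      # gas constant in kcal mol^-1 K^-1
ASSUMED_TEMP_K = 333.15   # assumption: 60 °C; replace with the actual reaction temperature


def ratio_from_ddg(ddg_kcal: float, temp_k: float = ASSUMED_TEMP_K) -> float:
    """Major:minor product ratio implied by a barrier difference (kinetic control)."""
    return math.exp(ddg_kcal / (R_KCAL * temp_k))


def ddg_from_ratio(major: float, minor: float, temp_k: float = ASSUMED_TEMP_K) -> float:
    """Barrier difference in kcal/mol implied by an observed major:minor ratio."""
    return R_KCAL * temp_k * math.log(major / minor)


# Barrier differences quoted in the text.
predicted_er = ratio_from_ddg(3.7)        # TS-III-trans-R vs TS-III-trans-S
predicted_ez = ratio_from_ddg(2.1)        # TS-IV-E vs TS-IV-Z
implied_ddg = ddg_from_ratio(96.0, 4.0)   # from the observed 96:4 er

print(f"predicted R:S ratio  ~ {predicted_er:.0f}:1")
print(f"predicted E:Z ratio  ~ {predicted_ez:.0f}:1")
print(f"ddG implied by 96:4  ~ {implied_ddg:.1f} kcal/mol")
```

Computed barrier differences of this kind typically carry uncertainties on the order of a kilocalorie per mole, so the comparison with the measured er is best read as a check on the direction and rough magnitude of the selectivity.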

The energy profile of the reaction of 3o (fig.
S11) and its detailed discussion are provided in
the supplementary materials. The results suggest
that the enantioselectivity is determined
in the 6-endo-trig cyclization step. Noncovalent
interaction (NCI) and atoms-in-molecules
(AIM) analysis performed with Multiwfn (59)
showed that attractive noncovalent
interactions in TS-III-RS-3o (fig. S11),
including the π-π stacking between the
triazole ring and the indole moiety; CH/π
attractions between both H, and H, on the
NHC part and the indole moiety; and the 1-n
interaction between the ester group and the
phenyl moiety, make it more favorable than
TS-III-SR-3o by 1.8 kcal/mol, which correlates
well with the enantioselectivity observed
in the experiment. We anticipate that
the broad accessibility and specific chemical 
reactivity of chiral NHC-boryl radicals will 
enable additional powerful catalytic enantio- 
selective radical transformations with wide 
applicability. 
3D microscopy at the nanoscale reveals unexpected
lattice rotations in deformed nickel 


Qiongyao He, Søren Schmidt, Wanquan Zhu, Guilin Wu, Tianlin Huang, Ling Zhang,
Dorte Juul Jensen, Zongqiang Feng, Andrew Godfrey, Xiaoxu Huang


In polycrystalline metals, plastic deformation is accompanied by lattice rotations resulting from 
dislocation glide. Following these rotations in three dimensions requires nondestructive methods that so 
far have been limited to grain sizes at the micrometer scale. We tracked the rotations of individual grains 
in nanograined nickel by using three-dimensional orientation mapping in a transmission electron 
microscope before and after in situ nanomechanical testing. Many of the larger-size grains underwent 
unexpected lattice rotations, which we attributed to a reversal of rotation during unloading. This inherent 
reversible rotation originated from a back stress—driven dislocation slip process that was more active 
for larger grains. These results provide insights into the fundamental deformation mechanisms of 
nanograined metals and will help to guide strategies for material design and engineering applications. 


Most materials are polycrystalline, meaning
that they are composed of a large
number of grains of varying size and
orientation. The characteristics and
arrangement of these grains, and of
the grain boundaries (GBs) between the grains, 
are fundamental in determining material prop- 
erties, including plasticity, i.e., the ability to 
undergo permanent shape change (1, 2). In
coarse-grained metals and alloys, plasticity 
is most typically sustained by the nucleation 
and glide of dislocations (linear crystal lat- 
tice defects). This behavior is accompanied 
both by crystal lattice rotations and by work 
hardening, which occurs due to the storage 
of dislocations during the deformation pro- 
cess (3, 4). In nanograined metals, the origin 
of plasticity is not clear because of the in- 
creased difficulty for operation of grain inte- 
rior dislocation sources (5, 6). Experiments 
and simulations have suggested a range of pos- 
sible alternative mechanisms, many of which 
are related to GB-mediated deformation (7, 8), 
including GB sliding (9, 10), GB migration 
(11, 12), and grain rotation (13, 14), although 
generally these are expected only to be important
for grain sizes below 10 to 15 nm (15, 16).
Dislocations (17-19), stacking faults (20, 21),
and deformation twinning (22) are, however, 
still frequently observed in deformed nano- 
grains, indicating that some form of dislocation- 
mediated plastic deformation is still active 
at the nanoscale. Another difference between 
deformation of coarse- and nanograined me- 
tals is that plastic strain in conventional coarse- 
grained materials is usually irrecoverable. 
By contrast, substantial time-dependent re- 
covery of plastic strain has been observed 
directly in nanograined aluminum and gold 
(23) and also indirectly in nanograined nick- 
el based on observations of a reversal in x-ray 
diffraction peak broadening during unload- 
ing (24). 

Another direct, though experimentally chal- 
lenging, approach for the investigation of plas- 
tic deformation is by following the change in 
orientation of individual grains inside a three- 
dimensional (3D) volume of a polycrystalline 
sample. For coarse-grained metals, this has 
been achieved using x-ray synchrotron meth- 
ods (3). For nanograined samples, previous 
observations have been limited to 2D measure- 
ments such as those on columnar nanograined 
palladium, in which a reversal of individual 
grain rotations on unloading has been reported 
(25). However, recent advances in the tech- 
nique for 3D orientation mapping in a trans- 
mission electron microscope (3D-OMiTEM) 
(26-28) now allow rapid, nondestructive 3D 
mapping of the shapes and crystallographic 
orientations of hundreds of nanograins within 
a specimen. 

We used the 3D-OMiTEM technique to pro- 
vide data on the crystallographic lattice rotations 
during compression of almost 300 individual 
nanosized grains in a nanograined nickel pil- 
lar by tracking the grain orientations before 
and after compression. In combination with in 
situ dark-field observations, our results reveal 
an unexpected reversal of plastic deforma- 
tion during unloading. This leads to overall 
lattice rotations for many grains, in particular
for larger nanograins, that differ compared
with previous observations on coarse-grained
metals. As a consequence, some grains exhibit
either only a small overall rotation angle or even
an overall rotation in the opposite direction to
that expected.


3D orientation mapping of nanograined nickel 


For this experiment, we prepared a nanograined 
nickel sample using direct current electro- 
deposition (29). Plan-view TEM observations 
revealed that the as-deposited sample con- 
sisted of nanoscale equiaxed grains without 
visible interior dislocations (fig. S1). We ob- 
served a moderately strong <111> fiber texture 
in the deposition direction, which is a typical 
growth texture associated with electrodeposi- 
tion. We fabricated a submicrometer-size pil- 
lar by focused ion beam milling, with the pillar 
axis taken perpendicular to the sample depo- 
sition direction, thereby ensuring a random 
distribution of crystal directions parallel to the 
compression axis for the grains in the pillar 
sample. We determined the 3D grain structure 
in the pillar sample using 3D-OMiTEM, after 
which we transferred the pillar to a PicoIndenter
sample holder and then compressed it inside
the TEM (fig. S2). After deformation, the sample
was transferred back to the original TEM
holder, and the 3D grain structure was again
determined using 3D-OMiTEM.

We provide examples of the 3D microstruc- 
tures of the nanograined nickel pillar before 
and after compression, as characterized using 
3D-OMiTEM (Fig. 1, A and B), where each grain 
is colored corresponding to its crystallographic 
orientation. The reconstructed volume of the 
as-fabricated pillar was close to 12,000,000 nm³,
with a diameter ranging from 100 nm at the
top to 180 nm at the bottom, providing several
hundred grains with a wide spectrum of crystallographic
orientations. The grains in the reconstructed
volume were mostly separated
by high-angle GBs (fig. S3) with grain sizes 
spanning a wide range, with a mean value of 
~20 nm (based on an equivalent sphere diam- 
eter calculation). The 3D shapes of many grains 
were equiaxed, as revealed in the plan-view 
direction, in good agreement with the 2D ob- 
servation, but were slightly elongated in the 
pillar cross-section. The grain sizes, morphol- 
ogy, crystallographic orientations, and spatial 
arrangement reveal the complex and hetero- 
geneous nature of the 3D grain structure. 

The as-fabricated pillar had a sidewall taper 
angle of ~3.5°. For this reason, the uppermost 
part of the pillar underwent localized inho- 
mogeneous plastic deformation and exhibited 
a substantial shape change after compres- 
sion (Fig. 1B). We focus from here only on the 
grains in the volume below this region (from 
positions below the arrows in Fig. 1B). We determined
the variation in compression strain
along the length of the pillar based on the displacement
of the center of mass of each tracked
grain and found it to vary smoothly from 5.8%
at a depth of 225 nm to 4.6% at 775 nm, with
an average strain of 4.8 ± 0.3% (fig. S4A). After
compression, the overall microstructure in the
analyzed volume remained largely unchanged
(Fig. 1, C and D), whereas the position, size (fig.
S4B), and crystallographic orientation of individual
grains changed slightly. We show an
example of a grain whose size decreased from 20.8 to
19.7 nm and whose orientation changed by 1.1°
with respect to its initial state (Fig. 1, E and F).
To analyze the deformation-induced lattice
rotations of the nanosized grains, we compared
the average orientation of each grain after compression
with that in the initial state for a total
of 289 grains.

Fig. 1. 3D-OMiTEM characterization of the deformation-induced structure changes in nanograined
nickel. (A and B) 3D orientation mapping of the nanograined nickel pillar before (A) and after (B) compression.
Grains are colored based on the Euler angle representation of the crystallographic orientation such that
similarly colored grains have similar crystallographic orientations. (C and D) 3D orientation mapping of the
analyzed volume below the arrows displayed in (B) before (C) and after (D) compression. (E and F) Zooming
in on one embedded nanosized grain before (E) and after (F) compression shows its 3D grain morphology and
crystallographic orientation evolution (indicated in the figure as Euler angles).


Observation of grain rotations during loading 
and unloading 


The compression axis rotation of each grain 
(i.e., comparing the crystal direction parallel 
with the compression axis before and after de- 
formation) varied between 0° and 8°, with an 
average rotation of 1.8° (fig. S5A). The compres- 
sion axis rotations exhibited a complex pattern 
(fig. S5B) in which the rotation was not uniquely 
determined by the initial compression axis di- 
rection. This pattern was indicated by the large 
differences in the magnitude of the compression 
axis rotation for grains with similar initial ori- 
entation (Fig. 2A). The lattice rotations can, 
however, be classified into two distinct types 
by comparing the observed rotation paths
with those calculated using a visco-plastic self- 
consistent (VPSC) crystal plasticity model based 
on deformation accommodation by dislocation 
slip (Fig. 2B). The rotation paths of 54% of the 
grains were roughly comparable to the VPSC 
predictions (Fig. 2C, type A), meaning that these 
grains followed an orientation-dependent lat- 
tice rotation similar to that previously observed 
for coarse-grained metals (30). By contrast, the 
remaining 46% of the grains rotated in direc- 
tions opposite to those predicted by the VPSC 
simulation (Fig. 2D, type B). 
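
The compression axis rotation discussed above can be illustrated with a short sketch. It is not the authors' analysis code. It assumes that each grain orientation is available as a 3×3 matrix mapping sample coordinates to crystal coordinates and that the compression axis is the sample z direction; the function names are hypothetical. Cubic symmetry is handled by taking the smallest angle over all 48 signed permutations of the crystal direction.

```python
import itertools
import numpy as np

def crystal_direction(g, axis=(0.0, 0.0, 1.0)):
    """Unit crystal direction parallel to a sample axis.

    Assumption: g maps sample coordinates to crystal coordinates.
    """
    v = np.asarray(g, dtype=float) @ np.asarray(axis, dtype=float)
    return v / np.linalg.norm(v)

def cubic_equivalents(v):
    """All 48 signed permutations of v (cubic symmetry including inversion)."""
    out = []
    for perm in itertools.permutations(range(3)):
        for signs in itertools.product((1.0, -1.0), repeat=3):
            out.append([s * v[i] for s, i in zip(signs, perm)])
    return np.array(out)

def axis_rotation_deg(g_before, g_after, axis=(0.0, 0.0, 1.0)):
    """Smallest angle (degrees) between the crystal directions that lie
    parallel to the compression axis before and after deformation."""
    c0 = crystal_direction(g_before, axis)
    c1 = crystal_direction(g_after, axis)
    cosines = cubic_equivalents(c1) @ c0
    return float(np.degrees(np.arccos(np.clip(cosines.max(), -1.0, 1.0))))
```

Applied to every tracked grain, such a function would return one rotation angle per grain, which could then be plotted in the stereographic triangle as in Fig. 2A. Note that a rotation about the compression axis itself leaves this angle at zero, so the measure captures only the part of the lattice rotation that moves the compression axis in the crystal frame.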

To directly observe the rotation of individual 
nanosized grains in the TEM, we performed 
dark-field imaging during in situ compression 
of the pillar, with images recorded at video frame 
rate (see movie S1). We show a series of frames 
taken during the loading and unloading proce- 
dure (Fig. 3). In the initial stages of loading, 
only minor microstructural changes took place 
away from the top of the pillar (Fig. 3, A and 
B). However, with further straining, detectable 
rotation of many grains took place (Fig. 3, B to 
D). For example, close to 25% of grains that were 
not oriented in a diffracting condition in the un- 
loaded state (i.e., those that are seen as dark in 
these images; e.g., grains 2 and 3) rotated into 
a diffracting condition during loading to be- 
come gradually brighter in these images. Sim- 
ilarly, nearly 25% of the grains that were in a 
strong diffraction condition before loading ro- 
tated out of this diffraction condition by the 
time the sample was fully loaded (e.g., grain 4). 
Unexpectedly, lattice rotations also took place
during unloading (Fig. 3, E to H). For example, 
grains 2 and 3, which rotated into a strong dif- 
fraction condition during loading, gradually 
turned weak in these images and became dark 
after full unloading. Similarly, grain 4, which 
rotated out of a strong diffraction condition 
during loading, rotated back into a diffract- 
ing condition during unloading. Comparing 
the diffraction contrast of the pillar before and 
after compression, the grains were found in 
general to rotate back toward their undeformed 
state during unloading. The observed unex- 
pected rotations were, however, heterogeneous 
and varied from grain to grain. For example, 
grains 2 and 3 almost reverted to their initial 
appearance in the dark-field images, whereas 
the contrast change for grain 4 was reversed in 
only part of the grain, with some residual lat- 
tice rotation remaining after unloading. For 
some grains (e.g., grains 1 and 5 in Fig. 3), a 
strong diffraction condition was maintained 
over the entire loading and unloading proce- 
dure. Additionally, the observation of constant 
diffraction contrast in some grains located 
both near to and farther from the pillar top 
demonstrates that no global rotation of the 
specimen took place during loading and un- 
loading. A separate experiment in which we 
followed the variation in intensity in diffrac- 
tion space during in situ loading and unload- 
ing of a nanograined nickel pillar sample also 
showed evidence for reversal of grain rota- 
tions upon unloading (fig. S6 and movie S2). 
Therefore, these diffraction space observations
strongly support the interpretation of the ob- 
served grain contrast changes in Fig. 3 as re- 
sulting from lattice rotations due to internal 
microstructural changes. 


Estimation of back-stress magnitude 


We could speculate that the unexpected lattice 
rotation during unloading is linked to the 
small sample size of the pillar used in our 
study, in which only a few grains were present 
in any transverse cross-section. However, a 
recovery of plastic deformation during unload- 
ing has also been observed in bulk nanograined 
nickel and iron, albeit in that case indirectly 
through in situ x-ray diffraction profile analy- 
sis (18, 24, 31). One possibility to account for 
these rotations is the back stresses arising 
from plastically inhomogeneous deformation 
(25). To investigate this possibility, we used 
Wu's method (32-34) to assess the contribu- 
tion of back stresses to the unexpected lattice 
rotations during unloading. We performed a 
further experiment on a separate pillar sample 
with similar size, in which we exposed the 
pillar to multiple loading and unloading cycles 
(Fig. 4A). We observed hysteresis loops, indi- 
cative of the presence of the back stresses, in 
the stress-strain curves for these unloading- 
reloading cycles (fig. S7). From analysis of each
hysteresis loop, we estimated the back stress
($\sigma_b$) from the following equation:

$$\sigma_b = \frac{\sigma_u + \sigma_r}{2} \qquad (1)$$

where $\sigma_u$ and $\sigma_r$ are the unloading yield point
and reloading yield point, respectively. To obtain
these yield point values as accurately as
possible, we fit each hysteresis loop as a polynomial
function. Our analysis of the data showed
that the back stress $\sigma_b$ increased markedly
with deformation strain, accounting for an
approximately constant fraction of 54% of
the maximum applied stress for each loading
cycle (Fig. 4A). Although it was not possible
in this experimental setup to directly observe
corresponding dislocation activity in situ during
the loading and unloading sequence, the
evidently high back stress revealed from the
cyclic loading curves was the most likely origin
for the unexpected lattice rotation of the nanosized
grains.

Fig. 2. Quantification of deformation-induced crystal rotations in nanograined nickel. (A) The
compression axis rotation of the 289 nanosized grains expressed in the unit stereographic triangle. The
position of each circle indicates the initial crystal direction parallel to the compression axis, and the size of
each circle is proportional to the rotation of the compression axis, with the largest circle representing
a rotation of 8°. (B) The rotation of each individual grain as predicted by a VPSC model. Each rotation is
represented by a black arrow with the initial orientation marked with a circle. (C and D) Experimental
observations of the rotations of the 289 embedded nanograins expressed in the stereographic triangle are
divided into two subgroups: type A, matching the VPSC simulations (C), and type B, not matching the
VPSC simulations (D). For clarity, the lengths of the arrows in (C) and (D) are drawn linearly proportional to
the rotation for each grain but are capped at a maximum length for rotation angles >3° (for these cases, the
initial orientation is marked with a hollow circle).
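
As a minimal numerical sketch of the back-stress estimate, assuming Eq. 1 takes the form of the mean of the unloading and reloading yield points, the calculation for one hysteresis loop can be written as follows. The function names and the two yield-point values are hypothetical and chosen only for illustration; the maximum applied stress of 2.5 GPa is the value implied by a back stress of 1.35 GPa at 54% of the maximum applied stress.

```python
def back_stress(sigma_u, sigma_r):
    """Back stress from one unloading-reloading loop (assumed form of Eq. 1).

    sigma_u: unloading yield point, sigma_r: reloading yield point (same units).
    """
    return 0.5 * (sigma_u + sigma_r)

def back_stress_fraction(sigma_u, sigma_r, sigma_max):
    """Back stress expressed as a fraction of the maximum applied stress."""
    return back_stress(sigma_u, sigma_r) / sigma_max

# Hypothetical yield points in GPa, for illustration only.
example_sigma_b = back_stress(1.0, 1.7)
example_fraction = back_stress_fraction(1.0, 1.7, 2.5)
```

In practice the two yield points would come from the polynomial fit of each loop rather than being entered by hand.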

It has been reported elsewhere that when 
GB-mediated mechanisms dominate plastic 
deformation, the rotation rate depends on the 
grain size d according to $d^{-n}$, where n varies
between 2 and 5 depending on the specific 
rotation mechanism (35). Regardless of the 
value of n, such a model predicts that grain 
rotations should be more active in fine, nano- 
sized grains. However, our measurements did 
not show a size-dependent rotation, indicating 
no important contribution from GB-mediated 
processes to the lattice rotations during defor- 
mation. Moreover, classic Hall-Petch strength- 
ening in nanograined nickel has been observed 
for grain sizes down to 10 nm, and only then 
followed by an apparent softening (36). The 
Hall-Petch breakdown is generally attributed
to a switch in deformation mechanism from 
dislocation-mediated processes to GB sliding. 
In our experiments, however, almost all of the 
grains were >10 nm (Fig. 4E), indicating that 
the observed pattern of grain rotation is un- 
likely to be associated with GB sliding. 


Analysis of dislocation activity 


To further explore the plastic deformation mech- 
anisms, we made postmortem high-resolution 
TEM observations of the compressed micro- 
structure in the nanograined nickel pillars. As 
shown in Fig. 4B, we observed full dislocations
(marked with ⊥) with a Burgers vector of
a/2[110] (where a is the lattice constant of
nickel) in the interior of the nanosized grains.
We also observed the nucleation and propagation
of partial dislocations, resulting in the
formation of stacking faults (Fig. 4C, white
lines) in some of the smaller grains. Recent
experiments have also noted the presence of
dislocations in deformed copper (17) and nickel
(18, 20) in grains as small as 3 nm. The presence
of a considerable dislocation density in the 
postdeformed state indicates that deformation 
of the nanograined nickel sample is accom- 
modated by dislocation-mediated plasticity, re- 
sulting in the observed lattice rotations. 

The generation of full dislocations or partial
dislocations from GBs is expected, therefore,
to be a key process that governs the lattice rotation
of nanograins. We can understand this
by comparing the applied stress with the critical
stress required to activate a new dislocation
from a GB source. According to dislocation
theory (37, 38), by approximating the source
size to be equal to the grain size ($d$), the critical
stress needed to emit either a perfect full dislocation
($\tau_f$) or a Shockley partial dislocation
($\tau_p$) can be described as

$$\tau_f = \frac{G b_f}{d} \qquad (2)$$

$$\tau_p = \frac{G b_p}{3d} + \left(1 - \frac{\delta}{d}\right)\frac{\gamma}{b_p} \qquad (3)$$

where $G$ is the shear modulus; $b_f$ and $b_p$ are the
Burgers vectors of the full and partial dislocations,
respectively; $\gamma$ is the stacking fault energy;
and $\delta$ is the equilibrium partial dislocation
spacing. For nanograined nickel, based on these
equations, the critical grain size $d_c$ below which
partial dislocation emission is favored over full
dislocation emission (fig. S8A) is estimated to be
18 nm. From the grain sizes determined from
the 3D-OMiTEM measurements, the critical 
stress for each individual nanograin was cal- 
culated and found to vary from 0.2 GPa for the 
largest nanograins to 14 GPa for the smallest 
(fig. S8B). The variation in critical stress, com- 
bined with the resolved shear stress dependence 
on grain orientation (through the Schmid factor), 
gives rise to a considerable stress difference and
plastic strain mismatch between “soft grains”
(large size and high Schmid factor) and “hard
grains” (small size and low Schmid factor) during
deformation, resulting in the development
of measurable back stresses.

Fig. 3. In situ TEM observation showing reversible rotations of nanosized grains. (A) Dark-field TEM
images recorded during compression using a flat diamond indenter. The grains lit up are in either a
{111} or a {200} strong diffraction condition. (B to D) A sequence of TEM images taken at different times
during the loading procedure where contrast changes show rotations of the grains. (E to H) A sequence
of TEM images showing the reverse rotation of grains during unloading. Grains 1 and 5 indicate examples
of grains in which the diffraction contrast remains constant during the whole loading and unloading
procedure. Grains 2, 3, and 4 represent examples of grains in which the diffraction contrast changes during
both the loading and unloading procedures. Schematic drawings of the loading and unloading are shown
to the right of the TEM images.
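
The grain size dependence of the two critical stresses can be sketched numerically. The sketch assumes that the full-dislocation stress is G·b_f/d and that the partial-dislocation stress is G·b_p/(3d) + (1 − δ/d)·γ/b_p, which is one common way of writing the Asaro-type expressions. All material parameters below (shear modulus, stacking fault energy, and partial spacing) are illustrative assumptions and are not the values used in the analysis described above; only the nickel lattice constant is a standard value.

```python
import math

# Illustrative parameters (assumptions). Units: GPa and nm.
# Note that 1 J/m^2 equals 1 GPa*nm, so GAMMA can be used directly.
G = 76.0                      # shear modulus, GPa (assumed)
A_NI = 0.3524                 # lattice constant of nickel, nm
B_F = A_NI / math.sqrt(2.0)   # |a/2<110>| full Burgers vector, nm
B_P = A_NI / math.sqrt(6.0)   # |a/6<112>| Shockley partial Burgers vector, nm
GAMMA = 0.125                 # stacking fault energy, J/m^2 (assumed)
DELTA = 1.0                   # equilibrium partial spacing, nm (assumed)

def tau_full(d):
    """Critical stress (GPa) to emit a full dislocation across a grain of size d (nm)."""
    return G * B_F / d

def tau_partial(d):
    """Critical stress (GPa) to emit a Shockley partial (assumed form of Eq. 3)."""
    return G * B_P / (3.0 * d) + (1.0 - DELTA / d) * GAMMA / B_P

def critical_grain_size():
    """Grain size (nm) at which tau_full equals tau_partial, solved in closed form."""
    return G * B_P * (B_F - B_P / 3.0) / GAMMA + DELTA
```

With these expressions, partial emission requires the lower stress below the crossover size and full dislocation emission requires the lower stress above it. The numerical value of the crossover depends directly on the assumed parameters.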

To better analyze the influence of these variations
on the plastic deformation process,
we defined a stress ratio $R = m\sigma/\min(\tau_f, \tau_p)$, where
$m$ is the Schmid factor (fig. S8C) calculated
assuming a {111}/<110> slip system for grains
with sizes larger than $d_c$ and a {111}/<112> slip
system for grains with sizes smaller than $d_c$. For
loading, $\sigma$ is taken as the maximum applied
stress; for unloading, $\sigma$ is taken as the estimated
back stress (1.35 GPa, corresponding to
54% of the maximum applied stress; see fig. S9).
As shown in Fig. 4D, dislocation-based plasticity
occurred for almost all the nanosized
grains, because 99% of the grains had a value
of $R \geq 1$ at the end of loading. For unloading,
we observed that 47% of the grains had $R \geq 1$,
approximately in agreement with the number
of grains showing a rotation opposite to
the VPSC model prediction (Fig. 2D). Plotting
the calculated $R$ values for unloading as a function
of grain size (Fig. 4E), we found that the
datasets for $R < 1$ and $R \geq 1$ exhibited only a small
overlap. We found grains with $R < 1$ only for
sizes <40 nm and values of $R \geq 1$ only for grains
>20 nm, supporting a size dependence for lattice
rotation upon unloading.
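
The stress ratio can be illustrated with a short sketch that computes the maximum Schmid factor for a given compression axis over the {111}/<110> or {111}/<112> slip systems and then forms R = mσ/min(τ_f, τ_p). The helper names are hypothetical, the sense of slip on the <112> systems is ignored for simplicity, and the critical stresses are passed in as plain numbers so that no particular form of the critical-stress equations is assumed here.

```python
import itertools
import numpy as np

def slip_systems(direction_family):
    """Unit (plane normal, slip direction) pairs on {111} planes of an fcc crystal.

    direction_family: (1, 1, 0) for full dislocations, (1, 1, 2) for partials.
    Both senses of each direction are included, which is harmless because
    the Schmid factor below uses an absolute value.
    """
    normals = [(1, 1, 1), (-1, 1, 1), (1, -1, 1), (1, 1, -1)]
    directions = set()
    for perm in itertools.permutations(direction_family):
        for signs in itertools.product((1, -1), repeat=3):
            directions.add(tuple(s * p for s, p in zip(signs, perm)))
    systems = []
    for n in normals:
        for d in directions:
            if np.dot(n, d) == 0:  # slip direction must lie in the slip plane
                n_hat = np.array(n, dtype=float) / np.linalg.norm(n)
                d_hat = np.array(d, dtype=float) / np.linalg.norm(d)
                systems.append((n_hat, d_hat))
    return systems

def schmid_factor(load_axis, systems):
    """Maximum |cos(phi) * cos(lambda)| over the given slip systems."""
    l = np.asarray(load_axis, dtype=float)
    l = l / np.linalg.norm(l)
    return max(abs(np.dot(n, l) * np.dot(d, l)) for n, d in systems)

def stress_ratio(m, sigma, tau_f, tau_p):
    """R = m * sigma / min(tau_f, tau_p); R >= 1 suggests dislocation activity."""
    return m * sigma / min(tau_f, tau_p)
```

For a grain with its [001] direction along the compression axis, `schmid_factor((0, 0, 1), slip_systems((1, 1, 0)))` returns the familiar value 1/√6. Evaluating `stress_ratio` once with the maximum applied stress and once with the back stress gives the loading and unloading values of R for that grain.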
Although the classical dislocation model that
we used here is very simplified, the agreement 
between the calculated results and the ex- 
perimental observations is quite satisfactory. 
Therefore, the observed unexpected lattice ro- 
tations can reasonably be attributed to the 
effect of released back stress during unload- 
ing. This back stress is of sufficiently high val- 
ue in many nanosized grains, in particular in 
the larger-size nanograins, to promote dislo- 
cation activity, thereby resulting in lattice 
rotations in the nanograined pillar sample 
during unloading. A large Bauschinger effect 
has also recently been reported in samples of 
Al with average grain size of up to 0.7 μm (39),
raising the possibility that similar phenomena
on unloading may be observed, albeit to a lesser
extent, up to the submicrometer grain scale. It
can be noted that structural transformations 
of GBs into low-energy states can also be trig- 
gered by the emission of partial dislocations 
during straining (11, 40, 41), with the result 
that the relaxed GBs become more stable against 
mechanical activation during unloading. Such 
a process is expected to contribute to the inhibition
of lattice rotations in grains with size
less than $d_c$ (Fig. 4E).

Our experimental analysis suggests that the
unexpected crystal lattice rotation during unloading
results from the strong heterogeneity
in deformation of nanograin ensembles, which
leads to the development of back stresses that
can drive strain recovery. We expect these observations
to have direct relevance for strain
engineering and for microscale and nanoscale 
electromechanical systems fabricated from 
metals, for which rigorous dimensional con- 
trol is important. Our results also emphasize 
the importance of mapping the microstruc- 
ture nondestructively in 3D at the nanoscale 
for understanding plastic deformation in nano- 
materials. The 3D-OMiTEM method used here 
is applicable to a wide range of nanograined 
metals and can be used to simultaneously map 
the crystallographic and morphological char- 
acteristics of nanograin ensembles with initial 
size as small as a few nanometers, allowing a 
statistical analysis of their evolution during 
deformation (or exposure to other external 
fields such as heating). Furthermore, such 3D 
measurements provide data in a format that 
can be directly coupled with advanced plastic- 
ity models such as molecular dynamics simu- 
lations and crystal plasticity finite element 
modeling to promote a deeper understanding 
of the 3D microstructural evolution and plas- 
ticity in nanograined metals. 

Fig. 4. Deformation mechanisms of nanograined nickel. (A) Repeated loading and unloading curves for
a nanograined nickel pillar (true stress versus true strain, left) and the measured back stress and the ratio of
back stress to applied stress versus strain, showing the mean and SD obtained from repeated fitting of
each loop (N = 3) (right). (B and C) Typical high-resolution TEM images of compressed nanograins
taken along a <110> zone axis, showing a full dislocation (marked with ⊥) with a Burgers vector b of
1/2 <110> and a stacking fault (marked with a white line) produced by the motion of Shockley
partial dislocations with Burgers vector b of 1/6 <112>. (D) Distribution of the parameter
R (= mσ/min(τ_f, τ_p)) for each grain during loading and unloading. (E) Corresponding histogram
showing the variation of R with grain size for unloading.



34. M. Yang, Y. Pan, F. Yuan, Y. Zhu, X. Wu, Mater. Res. Lett. 4, 145-151 (2016).
35. M. Upmanyu, D. J. Srolovitz, A. E. Lobkovsky, J. A. Warren, W. C. Carter, Acta Mater. 54, 1707-1719 (2006).
36. J. Hu, Y. N. Shi, X. Sauvage, G. Sha, K. Lu, Science 355, 1292-1296 (2017).
37. R. J. Asaro, P. Krysl, B. Kad, Philos. Mag. Lett. 83, 733-743 (2003).
38. R. J. Asaro, S. Suresh, Acta Mater. 53, 3369-3382 (2005).
39. S. Gao, K. Yoshino, D. Terada, Y. Kaneko, N. Tsuji, Scr. Mater. 211, 114503 (2022).
40. X. Li, K. Lu, Science 364, 733-734 (2019).
41. X. Zhou, X. Y. Li, K. Lu, Science 360, 526-530 (2018).


ACKNOWLEDGMENTS 


We thank N. Hansen for fruitful discussions. Funding: X.X.H.,
G.L.W., and A.G. were supported by the National Key Research and
Development Program of China (grant 2021YFB3702101) and the
National Natural Science Foundation of China (grants 52071038 and
52130107). Q.Y.H. was supported by the Natural Science
Foundation of Chongqing (grant cstc2021jcyj-msxmX1185). D.J.J.
was supported by Villum Fonden (grant MicroAM VIL 54495).
Author contributions: G.L.W., A.G., and X.X.H. conceived the
concept and designed the experiments. X.X.H., G.L.W., S.S., A.G.,
and D.J.J. supervised the project. Q.Y.H. prepared the samples.
Q.Y.H., W.Q.Z., G.L.W., T.L.H., and Z.Q.F. conducted the TEM
characterization. S.S. led development of the software for orientation
mapping. Q.Y.H., S.S., and W.Q.Z. performed the 3D-OMiTEM
measurements and analyzed the data. Q.Y.H. and L.Z. performed the in
situ TEM experiments. Q.Y.H. and G.L.W. performed the VPSC
modeling. Q.Y.H., G.L.W., D.J.J., A.G., and X.X.H. wrote the manuscript.
All authors contributed to discussions of the results and reviewed
the manuscript. Competing interests: The authors declare no
competing interests. Data and materials availability: All data are
available in the main text or the supplementary materials. License
information: Copyright © 2023 the authors, some rights reserved;
exclusive licensee American Association for the Advancement of
Science. No claim to original US government works.
https://www.science.org/about/science-licenses-journal-article-reuse

He et al., Science 382, 1065-1069 (2023), 1 December 2023


SUPPLEMENTARY MATERIALS 
science.org/doi/10.1126/science.adj2522 

Materials and Methods 

Figs. S1 to S10
References (42-46)
Movies S1 and S2 


Submitted 15 June 2023; accepted 2 November 2023 
10.1126/science.adj2522 




RESEARCH 


HUBBARD MODEL 


The Wiedemann-Franz law in doped Mott insulators without quasiparticles


Wen O. Wang, Jixun K. Ding, Yoni Schattner, Edwin W. Huang, Brian Moritz, Thomas P. Devereaux


Many metallic quantum materials display anomalous transport phenomena that defy a Fermi liquid 
description. Here, we use numerical methods to calculate thermal and charge transport in the doped 
Hubbard model and observe a crossover separating high- and low-temperature behaviors. Distinct 
from the behavior at high temperatures, the Lorenz number L becomes weakly doping dependent and 
less sensitive to parameters at low temperatures. At the lowest numerically accessible temperatures, 
L roughly approaches the Wiedemann-Franz constant $L_0$, even in a doped Mott insulator that lacks
well-defined quasiparticles. Decomposing the energy current operator indicates a compensation between 
kinetic and potential contributions, which may help to clarify the interpretation of transport experiments 
beyond Boltzmann theory in strongly correlated metals. 


Landau’s notion of quasiparticles greatly
simplified the language of transport in
systems with a macroscopic number of
interacting degrees of freedom in terms
of “free” objects with renormalized properties
that participate in transport through
a semi-classical or Boltzmann framework. As 
such, transport behavior of Fermi liquids is 
governed solely by kinematic constraints of a 
Fermi surface and collisions between otherwise
free particles. Yet in many correlated metals,
including the high-transition temperature
(or critical temperature, $T_c$) cuprates, anomalous
transport phenomena have been uncovered
that violate these rules: strange metal
resistivity that increases linearly with temperature,
not saturating as the quasiparticle mean-free-path
approaches the lattice spacing (1-3);
inconsistency with Kohler’s rule, which gov- 
erns the scaling behavior of magnetoresistance 
from Boltzmann theory (4-6); and violations 
of the Wiedemann-Franz law, which constrains 
the ratio of thermal to electrical conductivity 
(7-18). 

The ubiquity of behavior that violates no- 
tions of the Fermi liquid has led to tremendous 
interest in determining how heat and charge 
currents propagate in such systems (19-23). 
Analysis of the large body of experimental 
transport results in correlated materials has 
been hindered by the use of an assumed
Boltzmann-like theory and reductive conclu- 
sions on the nature of transport in terms of 
Drude-like single-particle concepts. This great- 
ly amplifies the need for deeper analysis that 
avoids oversimplifications, but there is very 
little known from exact methods about the
nature of transport in strongly interacting
systems. Many advanced numerical calculations
have focused on characterizing ground-
state properties (24, 25), but a picture of 
transport is incomplete without an understand- 
ing of the excited states in these materials. 
Analytical approaches are hampered by the fact 
that properly evaluating transport involves 
calculating many higher-order correlation func- 
tions without relying on the simplifying assump- 
tions of quasiparticles and Boltzmann theory, 
which only punctuates the need for more accu- 
rate and precise determinations of transport. 
Here, we numerically study the DC longitudinal
thermal conductivity $\kappa$ in the doped two-dimensional
(2D) $t$-$t'$-$U$ Hubbard model,
which exhibits strange metallic electric transport
over a wide hole doping $p$ and temperature
$T$ range (26-29). We evaluate the many-body
Kubo formula using the determinant quantum
Monte Carlo (DQMC) (30, 31) algorithm,
which is numerically exact, unbiased, and nonperturbative,
and maximum entropy analytic
continuation (MaxEnt) (32, 33), which is
typically reliable in systems with strong interactions
that lack sharp features in frequency
[see supplementary materials of (26)].
We define $\kappa$ as the linear response of the heat
current ($J_Q$) induced by a parallel temperature
gradient and normalized by system size
$N$, $\kappa = -\langle J_{Q,x}\rangle/(N \partial_x T)$, under the condition
of zero charge current. Distinct from the incoherent
behavior at high temperatures, we
observe that the Lorenz number, the ratio
between the thermal and charge conductivity
$L = \kappa/(T\sigma)$, has a weak doping and parameter
dependence in the low-temperature regime
and roughly approaches the Wiedemann-Franz
law prediction $L_0 = \pi^2/3$ as temperature decreases
down to the lowest accessible value, even
in the absence of long-lived quasiparticles. Methodological
details, including a systematic analysis
of finite size and Trotter errors, as well as
extensive supporting data, can be found in (34).

Fig. 1. Temperature and doping dependence of thermal and charge conductivity. (A) DC thermal
conductivity $\kappa$. (B) $\kappa/T$ focused on the low-temperature regime. (C) DC charge conductivity $\sigma$ multiplied by
temperature $T$. (D) $\sigma$ focused on the low-temperature regime. The high-temperature dotted lines in (A) and
(C) are infinite-temperature limits calculated via a moments expansion (26, 35). Parameters: $U/t = 8$ and
$t'/t = -0.25$. A crossover temperature $T_{xo} \sim t$ separates low- and high-temperature regimes in (A) and (C).
Error bars are shown but may be smaller than the size of the data markers.


Thermal and charge conductivity 


The DC longitudinal thermal conductivity $\kappa(T)$
is shown in Fig. 1A; for comparison, the DC
longitudinal charge conductivity $\sigma(T)$ (26) (multiplied
by $T$) is shown in Fig. 1C. In the infinite-temperature
limit, $\kappa \propto 1/T^2$ and $\sigma \propto 1/T$
(26, 27, 35, 36). As $T$ decreases from this limit,
we observe a crossover at roughly $T_{xo} \sim t$, separating
distinct behavior in two regimes for
both $\kappa$ and $\sigma$. The thermal conductivity $\kappa$ decreases with doping at high
temperatures, whereas it increases with doping
at low temperatures. Although $\sigma$ generally increases
with doping at all temperatures, the temperature
dependence of $T\sigma$ displays kinks, or
even nonmonotonic behavior, at roughly $T_{xo}$.
Below $T_{xo}$, $\kappa/T$ and $\sigma$ display similar doping and
temperature dependences (Fig. 1, B and D), suggesting
persistent correlations between thermal
and charge transport even for a strange metal
phase where quasiparticles are not well-defined.


Lorenz number and its temperature and 
parameter dependence 


The Lorenz number L(T) highlights the cor- 
relation between thermal and charge trans- 
port (Fig. 2). Aside from the half-filled Mott 
insulator, where L diverges with decreasing 
temperature, in the doped metals L shows a 
crossover similar to that in κ and σ. At high 
temperatures, high-energy excited states be- 
come important (36, 37), such that quasi- 
particles are not well-defined and electrons 
have extraordinarily short mean-free-paths. 
L has a nonmonotonic temperature depen- 
dence and decreases with increasing doping. 
Below T_xo, L displays substantially reduced 
doping dependence, collapsing roughly onto 
a single set of curves. This set of curves in- 
creases monotonically with decreasing tem- 
peratures, approaching a constant that roughly 
corresponds to L_0 = π²/3, the Lorenz num- 
ber as predicted by the Wiedemann-Franz law. 

In the Hubbard model, relaxation primarily 
occurs through Umklapp scattering. To test 
its impact on the conductivities and L, we 
modulate Umklapp scattering by modifying 
Fig. 2. Lorenz number. Symbols: Calculated L = κ/(Tσ) normalized by L_0. The lines are guides to the eye. 
At low temperatures, below T_xo ~ t, L/L_0 approaches roughly 1, marked by the black star. Parameters: 
U/t = 8 and t′/t = −0.25. Cartoons: At high temperatures, high-energy excited states are important (36, 37) 
and transport is incoherent; electrons are strongly correlated and have an extraordinarily short 
mean-free-path. At low temperatures, the electrons are on their way toward some sort of "coherence": 
electrons have a longer mean-free-path, although not long enough for well-defined long-lived 
quasiparticles. Although single-particle and individual transport properties show signatures of 
anomalous strange metal and non-Fermi liquid behavior, the Lorenz number still roughly approaches 
the Wiedemann-Franz law's prediction as temperature decreases. 
Fig. 3. Parameter dependence of the Lorenz number. L/L_0 for (A) U/t = 6 and t′/t = −0.25; (B) U/t = 6 
and t′/t = 0; (C) U/t = 8 and t′/t = 0; (D) U/t = 10 and t′/t = 0. The black stars mark the value 1. 
The lowest temperatures are lower for smaller U owing to a better behaved fermion sign. 
Fig. 4. Kinetic and potential decomposition of the Lorenz number. (A) Normalized kinetic contribution 
L_K/L_0. The black star marks the value 1. (B) Normalized potential contribution L_P/L_0. The black dotted line marks 
the value 0. Parameters: U/t = 8 and t′/t = −0.25. 


the Hubbard U and next-nearest-neighbor 
hopping t′, with the results shown in Fig. 3. 
The high-temperature peak position of L is 
largely controlled by U, increasing with in- 
creasing U, similar to the behavior of the spe- 
cific heat [see fig. S9 in (34)]. For temperatures 
below the crossover, there is no strong de- 
pendence of L on either U or t’, suggesting that 
the low-temperature behavior is generic to the 
strongly correlated Hubbard model: Chang- 
ing the shape of the Fermi surface (t′) or the 
strength of Umklapp scattering (U) does not 
appreciably alter L at the temperatures ac- 
cessible through DQMC. 


Decomposing the Lorenz number 


To better understand the behavior below T_xo, it 
is useful to look at the operator contributions to 
the conductivities. Determining κ in the Hubbard 
model using the Kubo formula requires one to 
consider the two-particle term in the energy 
current operator arising from electron-electron 
interactions, as opposed to Boltzmann theory 
that relies entirely on single-particle proper- 
ties. The energy current operator J_E consists 
of a single-particle kinetic energy contribution, 
J_K, similar to that appearing in the charge cur- 
rent operator J, plus an additional term J_P, 
which we call the potential energy current that 
depends explicitly on the interaction and no- 
tably contains a two-particle current [see eq. S2, 
eq. S3, and the relevant discussion of the For- 
malism in (34)]. The heat current J_Q, from 
which we obtain κ, itself contains an additional 
term −μJ, where μ is the chemical potential. 
However, under the condition of zero charge 
current (⟨J⟩ = 0), terms proportional to ⟨J⟩ will 
not contribute to ⟨J_Q⟩, leaving only ⟨J_K⟩ and 
⟨J_P⟩. In this way, we separate κ into kinetic 
and potential contributions κ_{K/P} = −⟨J_{K/P,x}⟩/ 
(N ∂_x T). Similarly, we can express the Lorenz 
number L as a sum of its kinetic and poten- 
tial contributions, with L = L_K + L_P, where 
L_{K/P} = κ_{K/P}/(Tσ) (Fig. 4, A and B). 
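
The decomposition L = L_K + L_P can be sketched numerically as follows. This is only an illustration under stated assumptions: natural units with L_0 = π²/3, a hypothetical helper name, and made-up input values chosen to mimic a kinetic term above L_0 offset by a negative potential term. None of the numbers come from the simulations.

```python
import math

# Wiedemann-Franz value in units with k_B = e = 1 (assumed convention).
L0 = math.pi ** 2 / 3


def lorenz_decomposition(kappa_kinetic, kappa_potential, sigma, temperature):
    """Return (L_K, L_P, L) with L_K/P = kappa_K/P / (T * sigma)."""
    if sigma <= 0 or temperature <= 0:
        raise ValueError("sigma and temperature must be positive")
    denominator = temperature * sigma
    l_kinetic = kappa_kinetic / denominator
    l_potential = kappa_potential / denominator
    return l_kinetic, l_potential, l_kinetic + l_potential


# Hypothetical values: the kinetic part overshoots L0 and the potential
# part is negative, so the two compensate in the sum.
T, sigma = 0.25, 2.0
kappa_k = 1.2 * L0 * T * sigma
kappa_p = -0.2 * L0 * T * sigma
l_k, l_p, l_total = lorenz_decomposition(kappa_k, kappa_p, sigma, T)
print(l_k / L0, l_p / L0, l_total / L0)
```

Keeping the two contributions separate makes it possible to see whether a value of L close to L_0 arises from either term alone or only from their sum.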

At high temperatures, the kinetic energy 
contribution L_K is relatively small and doping 
independent, whereas the potential energy 
contribution L_P is large at small doping and 
decreases for increasing hole concentration 
owing to the reduction of double occupancies. 
This doping dependence is imparted to the 
combined L (as already shown in Fig. 2). Be- 
low the crossover temperature T_xo and at large 
doping, L_P is relatively small and L_K and L 
approach L_0. At low doping, L_K increases with 
decreasing temperature, while L_P decreases 
and changes sign at roughly T_xo. The separate 
contributions from the kinetic and potential 
terms show opposing behavior, which be- 
comes more pronounced for lower doping, and 
effectively compensate one another, result- 
ing in L that approaches L_0. Thus, unexpected- 
ly, the ratio L displays a relative insensitivity 
to doping and Hubbard model parameters 
[see fig. S10 in (34)], at the lowest accessible 
temperatures. 


Discussion and outlook 


The congruence between charge and thermal 
transport in the Hubbard model is unexpected. 
For scattering dominated by elastic processes, 
such as disorder or quasi-elastic phonon scat- 
tering above the Debye temperature, the thermal 
and charge conductivity are correlated through 
the Wiedemann-Franz law (13, 21, 38, 39), such 
that for T much lower than the Fermi temper- 
ature, one obtains the Lorenz number L = 
L_0 = π²/3. For both Fermi liquids and non- 
Fermi liquids without disorder, L deviates 
substantially from L_0 (21, 39, 40). Despite our 
lack of knowledge about the exact behavior 
of the Hubbard model at lower temperatures 
(Fermi liquid or not), caused by the fermion 
sign problem, the result that L approaches 
a weakly doping and Hubbard parameter- 
dependent constant very close to L_0 indicates 
a surprisingly universal behavior. This behavior 
is observed only when both single- and two- 
particle contributions are properly accounted 
for in the heat-current operator. 

Our results may be understood in three pos- 
sible ways. First, although the temperatures in 
our study are below the magnetic exchange 
energy scale J, our results may not yet be in the 
asymptotic low-temperature regime to assess 
the T → 0 limit. Second, one might expect the 
approximate Wiedemann-Franz ratio to emerge 
in a system where both charge and thermal 
currents relax predominantly through Umklapp 
scattering in our temperature regime. Lastly, 
it may be that such a compensation effect be- 
tween kinetic and potential energy contributions 
to L cannot be cast in the usual Boltzmann- 
like formulation for strongly interacting, aniso- 
tropic systems such as the Hubbard model. 

Finally, what can our results say about the 
strong violation of the Wiedemann-Franz law 
that has been observed in cuprates at room 
temperature, with L larger than L_0 by a factor 
of 3 or more (7, 10, 18, 38)? One explanation 
for this is that the strong interaction enhances 
the electronic contribution to thermal trans- 
port, whereas another explanation would rely 
on a substantial phonon contribution to the 
heat current. Our observation over the exper- 
imentally relevant temperature range that the 
electronic contribution L roughly approaches 
L_0 from below would be consistent with sce- 
narios in which the large L in cuprates re- 
quires an appreciable phonon contribution 
to heat transport. 
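
The arithmetic behind this last point can be sketched as follows. The sketch assumes, purely for illustration, that electronic and phonon heat conduction add (κ = κ_el + κ_ph) and share the same σ and T in the measured ratio; the function name and input values are hypothetical.

```python
def phonon_fraction(measured_ratio, electronic_ratio=1.0):
    """Return kappa_ph / kappa_total given L/L0 ratios.

    measured_ratio:   total L / L0 (electrons plus phonons).
    electronic_ratio: electronic L / L0, taken here as an upper bound of 1.
    Assumes kappa_total = kappa_el + kappa_ph (illustrative assumption).
    """
    if measured_ratio <= 0 or electronic_ratio < 0:
        raise ValueError("ratios must be non-negative and measured_ratio > 0")
    if electronic_ratio > measured_ratio:
        raise ValueError("electronic part cannot exceed the measured total")
    return 1.0 - electronic_ratio / measured_ratio


# If the measured L is 3 * L0 and the electronic part is at most L0,
# phonons would need to carry at least this share of the heat current.
fraction = phonon_fraction(3.0, 1.0)
print(fraction)
```

Under these assumptions, an electronic contribution that stays at or below L_0 leaves at least two-thirds of a threefold-enhanced L to be supplied by phonons.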
Divergent molecular networks program functionally 
distinct CD8+ skin-resident memory T cells 


Simone L. Park, Susan N. Christo, Alexandria C. Wells, Luke C. Gandolfo, Ali Zaid, 
Yannick O. Alexandre, Thomas N. Burn, Jan Schröder, Nicholas Collins, Seong-Ji Han, 
Stéphane M. Guillaume, Maximilien Evrard, Clara Castellucci, Brooke Davies, Maleika Osman, 
Andreas Obers, Keely M. McDonald, Huimeng Wang, Scott N. Mueller, George Kannourakis, 
Stuart P. Berzins, Lisa A. Mielke, Francis R. Carbone, Axel Kallies, Terence P. Speed, 
Yasmine Belkaid, Laura K. Mackay 


Skin-resident CD8+ T cells include distinct interferon-γ-producing [tissue-resident memory T type 1 
(TRM1)] and interleukin-17 (IL-17)-producing (TRM17) subsets that differentially contribute to immune 
responses. However, whether these populations use common mechanisms to establish tissue residence 
is unknown. In this work, we show that TRM1 and TRM17 cells navigate divergent trajectories to acquire tissue 
residency in the skin. TRM1 cells depend on a T-bet-Hobit-IL-15 axis, whereas TRM17 cells develop 
independently of these factors. Instead, c-Maf commands a tissue-resident program in TRM17 cells parallel to 
that induced by Hobit in TRM1 cells, with an ICOS-c-Maf-IL-7 axis pivotal to TRM17 cell commitment. 
Accordingly, by targeting this pathway, skin TRM17 cells can be ablated without compromising their TRM1 
counterparts. Thus, skin-resident T cells rely on distinct molecular circuitries, which can be exploited to 
strategically modulate local immunity. 


Exposure to pathogens and commensal microbes drives the 
accumulation of nonrecirculating T cells at barrier surfaces 
(1, 2). These tissue-resident memory T (TRM) cells are distinct 
from circulating T (TCIRC) cells and are critical for imparting 
local immune protection (3-5) and orchestrating tissue repair 
(6, 7). However, CD8+ TRM cells have also been implicated in 
autoimmune pathology (8). Understanding how TRM cells are 
generated and maintained is therefore relevant across a variety 
of diseases. Although the factors that control tissue residency 
vary across organs (4), the shutdown of tissue egress and 
induction of retention are universal (9), with the transcription 
factors (TFs) Hobit, Blimp1, and Runx3 playing pivotal roles in 
these processes (10, 11). However, recent work has pointed to 
uncharted heterogeneity within the tissue-intrinsic CD8+ TRM 
cell pool (12-14). In addition to interferon-γ (IFN-γ)-producing 
TRM cells (TRM1 cells), which have been well studied in the 
context of infection, functionally distinct TRM cells producing 
interleukin-17 (IL-17) (TRM17 cells) coinhabit human skin (14). 
TRM1 cells are associated with viral and tumor control (13), 
whereas skin TRM17 cells promote bacterial defense and wound 
healing (6, 7, 14). Moreover, TRM1 and TRM17 cells 
differentially contribute to skin pathologies, including vitiligo 
and psoriasis, respectively, where they persist long after disease 
resolution (13, 15-18). Thus, devising strategies to target each 
subset may enable deliberate tuning of the skin immune 
repertoire in a manner tailored toward specific disease contexts. 

In this work, we report that CD8+ TRM1 and 
TRM17 cells engage distinct molecular networks 
to establish tissue residency in the skin. We 
show that targeting factors specifically required 
by either subset enables strategic modulation of 
the immune landscape, unlocking avenues to 
temper regional immunity. 


Distinct populations of CD8+ TRM cells 
occupy the skin 


Human skin contains CD69+CD103+CD8+ TRM cells that produce 
either IFN-γ (TRM1 cells) or IL-17A (TRM17 cells) (Fig. 1, A 
and B, and fig. S1A). However, studies identifying factors that 
govern skin TRM cell development have focused on TRM1 cells 
(10, 19, 20), leaving the unanswered question of whether TRM17 
cells rely on similar factors to establish residency. To address 
this, we profiled the spectrum of functional states adopted by 
skin CD8+ TRM cells responding to various stimuli. After viral 
skin infection, cutaneous melanoma (21), or hapten-induced 
inflammation (22), CD8+CD44^hi TRM cells produced IFN-γ but 
negligible IL-17A upon stimulation. By contrast, exposure to the 
commensal bacterium Staphylococcus epidermidis generated 
functionally distinct populations of CD8+ TRM1 and TRM17 
cells, mirroring the array of CD8+ T cells in human skin (Fig. 1, 
A to C, and fig. S1, A to G), which provides opportunities to 
explore whether the developmental pathways of these cells 
overlap or diverge. 

Expression of CD49a delineates TRM1 cells in human skin (13). 
We accordingly found that regardless of the mode of induction, 
TRM1 but not TRM17 cells in mice expressed CD49a and the TF 
T-bet, whereas TRM17 cells expressed T helper 17 (TH17) 
lineage-defining molecules CCR6 and RORγt (Fig. 1D and fig. 
S1, D, E, and H to K). Despite these differences, both T-bet+ 
(TRM1-like) and RORγt+ (TRM17-like) cells localized to the 
epidermal layer of the skin (Fig. 1E and fig. S1, L and M). 
Nevertheless, TRM17 cells deviated from TRM1 cells in 
expression of T cell state-specifying molecules (Cx3cr1, Foxo1, 
Id3, Slamf7, Tcf7, and Tox) as well as reduced expression of 
master residency TFs Zfp683 (Hobit) and Runx3 (Runx3) (Fig. 1, 
F to I, and fig. S2, A and B). Moreover, S. epidermidis 
association of Hobit and Blimp1 reporter mice confirmed that 
whereas both TRM subsets expressed Blimp1, Hobit and Runx3 
were highly expressed in skin TRM1 cells but were absent from 
TRM17 cells (Fig. 1I). 

We consequently wondered whether TRM1 and TRM17 cells could 
establish residency with the same propensity. Parabiosis 
experiments confirmed that both subsets were overwhelmingly 
nonrecirculating (Fig. 1, J and K, and fig. S3, A and B). 
However, tracking polyclonal and commensal-specific (7) TRM1 
and TRM17 cells revealed that TRM1 cells were stably 
maintained, whereas TRM17 cell numbers declined fourfold 
between 2 and 10 weeks postassociation (Fig. 1L and fig. S3, C 
and D). Thus, both TRM1 and TRM17 cells are bona fide 
residents, but skin TRM1 cells exhibit superior longevity 
compared with TRM17 cells with identical antigen specificity. 

S. epidermidis persists as part of the skin flora but declines in 
abundance over time (14). We reasoned that curtailed TRM17 
cell maintenance may reflect a differential requirement for 
antigen persistence. We associated bone marrow (BM) chimeras 
of wild-type (WT) and T cell receptor (TCR)-deficient 
(Trac^fl/fl × Rosa26^CreERT2/+) and control cells with 
S. epidermidis to establish TRM1 and TRM17 populations and 
then administered tamoxifen to induce TCR deletion (fig. S3E). 
Although tamoxifen treatment induced the ablation of TCR 
expression on Trac^fl/fl × Rosa26^CreERT2/+ T cells, TRM1 and 
TRM17 populations were both maintained equally to those in 
controls, which implies that persistence of either subset was 
antigen agnostic (Fig. 1M and fig. S3, F to K). TRM1 cells 
displayed elevated levels of Bcl-2 and Ki67 compared with 
TRM17 cells (Fig. 1N), which suggests that TRM1 cells may 
maintain a proliferation or survival advantage. These findings 
suggested that TRM1 and TRM17 cells are differentially 
regulated and, moreover, that 
Fig. 1. CD8+ TRM1 and TRM17 cells are differentially regulated in the skin. 
(A) Frequency of IFN-γ- and IL-17-expressing TRM cells in healthy human skin 
or the flank skin of C57BL/6 mice >2 weeks after herpes simplex virus (HSV) 
infection or topical S. epidermidis association. (B and C) Proportion of 
IL-17A+IFN-γ− human (B) or mouse (C) skin TRM cells >2 weeks after indicated 
manipulations. VACV, Vaccinia virus; DNFB, 2,4-dinitrofluorobenzene. (D) Marker 
expression by skin CD8+ IFN-γ+IL-17A− (TRM1) or IFN-γ−IL-17A+ (TRM17) cells 
>2 weeks after S. epidermidis. (E) Representative intravital two-photon microscopy 
images of skin from Tbet-ZsGreen.RORγtE2-Crimson reporter hosts >2 weeks after 
introduction of S. epidermidis. T-bet is red, RORγt is cyan, and second-harmonic 
generation (SHG) is white. Scale bar, 30 μm. HF, hair follicle. (F) Differentially expressed 
genes between skin CD8+CCR6− (TRM1) and CD8+CCR6+ (TRM17) cells. (G and 
H) Expression of indicated molecules by skin or ear pinnae TRM1 or TRM17 cells from 
C57BL/6 (G), Hobit-tdTomato (Tom) (H), or Blimp-Tom reporter mice compared 
with TCIRC cells (Spl) or WT skin CD8+ T cells (Ctrl) >2 weeks postassociation. 
(I) Geometric mean fluorescence intensity (gMFI) of indicated molecules. (J and 
K) Proportion (J) and representative frequency plots (K) of CD8+CD44^hi splenic 
T cells or skin TRM1 and TRM17 cells derived from host (CD45.2+) or partner (CD45.1+) 
S. epidermidis-associated mice 4 weeks after parabiosis. (L) Number of total 
(solid) or f-MIIINA:H2-M3 tetramer+ (dashed) TRM1 and TRM17 cells isolated from the 
skin and ear pinnae. Statistical testing compares indicated time points with 2 weeks 
postassociation. (M) Mixed BM chimeras reconstituted with Trac^fl/fl × Rosa26^CreERT2/+ 
and WT cells were treated with tamoxifen (Tam) or vehicle (Ctrl) >2 weeks after 
S. epidermidis association. Shown is the ratio of TCRβ+ (Ctrl) or TCRβ− (Tam) 
Trac^fl/fl × Cd4-cre-ERT2/+ to WT skin TRM cells normalized to the spleen 4 weeks 
post-Tam. (N) Marker expression by skin CD8+ TRM1 and TRM17 cells 2 weeks 
postassociation. Data are representative of eight donors [(A) and (B)] or are 
representative of [(A) to (D), (G), (H), and (K)] or are pooled from [(B), (J), (L), 
(M), and (N)] two to three independent experiments with n = 8 to 10 [(A) to (D), 
(G), (H), (L), and (N)], n = 5 parabiotic pairs [(J) and (K)], n = 14 to 15 (M), or 
n = 8 (N) mice per group. *P < 0.05; **P < 0.01; n.s., not significant; Mann-Whitney 
U test. Bars represent means, and symbols represent individual mice. 
TRM17 cells may engage noncanonical pathways 
to establish tissue residency. 
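
Group comparisons in the figures are reported with the Mann-Whitney U test. As a minimal sketch of how that test works, the following standard-library implementation computes the U statistic and an exact two-sided P value by enumerating all group relabelings. It is an illustration only, practical for small groups; the function names and cell counts are hypothetical and are not data from this study.

```python
from itertools import combinations


def midranks(values):
    """Assign 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, exact two-sided P) for two small independent samples."""
    n1, n2 = len(x), len(y)
    ranks = midranks(list(x) + list(y))

    def u_min(indices):
        rank_sum = sum(ranks[i] for i in indices)
        u1 = rank_sum - n1 * (n1 + 1) / 2
        return min(u1, n1 * n2 - u1)

    observed = u_min(range(n1))
    # Enumerate every way of labelling n1 of the pooled values as group x.
    splits = list(combinations(range(n1 + n2), n1))
    extreme = sum(1 for s in splits if u_min(s) <= observed + 1e-12)
    return observed, extreme / len(splits)


# Hypothetical cell counts per mouse for two groups of four mice.
group_a = [410, 380, 450, 500]
group_b = [120, 90, 150, 200]
u_stat, p_value = mann_whitney_u(group_a, group_b)
print(u_stat, p_value)
```

Because the test uses only ranks, it makes no assumption that cell numbers are normally distributed, which suits small groups of mice.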


TRM1 and TRM17 cells display divergent 
developmental trajectories 


We investigated whether distinct molecular programs may 
underpin TRM1 versus TRM17 cell development. Because TRM17 
cells expressed comparatively reduced levels of T-bet and Hobit 
(Fig. 1I), we examined whether they were differentially 
regulated by T-bet, which promotes Hobit expression and is 
critical for viral-specific TRM1 cells (10, 23). Loss of T-bet in 
S. epidermidis-associated mice led to a reduction in polyclonal 
and f-MIIINA:H2-M3+ TRM1 cell numbers but increased numbers 
of TRM17 cells (Fig. 2, A and B, and fig. S4, C to E). Enhanced 
IL-17A production by T-bet-deficient TRM precursors occurred 
as early as 1 week after S. epidermidis association (fig. S4F). 
By contrast, the combined loss of T-bet and Eomes further 
enhanced TRM17 cell generation but resulted in a partial 
recovery of IFN-γ production by skin CD8+ T cells (fig. S4G). 
NFIL3 can interact directly with T-bet (24) and suppresses 
TH17 cell development (25). Although both CD8+ skin TRM1 
and TRM17 cells up-regulated NFIL3 compared with TCIRC 
cells, the loss of NFIL3 led to a selective reduction in TRM1 
cell numbers alone (fig. S4, H to L). 
Residual T-bet expression is essential for skin TRM cell 
responsiveness to IL-15 (23). Because TRM17 cells developed 
independently of the T-bet-Hobit axis, we explored whether 
they might also differ in reliance on IL-15. In agreement with 
findings made in the context of infection (19, 23), IL-15 
deficiency led to a marked decline in skin CD8+ TRM1 cells 
after S. epidermidis association, but skin TRM17 cell numbers 
remained unchanged (Fig. 2, C and D, and fig. S5). By contrast, 
both skin TRM1 and TRM17 cell frequencies were reduced in the 
absence of transforming growth factor-β (TGF-β) signaling, and 
forced TGF-β signaling selectively enriched TRM17 cell 
frequencies (fig. S6, A to H). 
Although both TRM1 and TRM17 cells expressed the IL-15 
receptor β chain (CD122), TRM17 cells additionally 
up-regulated IL-7 receptor α chain (CD127) (fig. S7, A and B). 
To determine whether this translated to an increased 
dependence on IL-7, we blocked IL-7R signaling during TRM 
cell generation and maintenance phases and observed a marked 
reduction in TRM17 compared with TRM1 cells (Fig. 2, E and F, 
and fig. S7, C to E). Thus, skin TRM1 cells require IL-15 for 
survival, whereas skin TRM17 cells rely on IL-7. Discrete skin 
TRM cell subsets therefore engage distinct developmental 
circuitries, with only TRM1 cells using the canonical 
T-bet-Hobit-IL-15 axis. 


Independently aligned Hobit- and 
c-Maf-centered axes command TRM1 
and TRM17 commitment 

Despite differences in regulation, skin TRM1 and TRM17 cells 
shared overwhelmingly similar transcriptomes compared with 
TCIRC cells (Fig. 3, A and B). We therefore sought to identify 
alternate factors that might enforce residency in TRM17 cells in 
the absence of cardinal regulators Hobit and Runx3. Of the 
genes distinguishing TRM1 and TRM17 cells (Fig. 1F), 157 were 
designated as TFs or had DNA binding potential. Those 
up-regulated in TRM17 cells included skin IL-17-producing 
CD8+ T (Tc17) cell fate specifiers (e.g., Rorc and Gata3) (6), 
factors promoting TRM cell formation (e.g., Ahr, Prdm1, and 
Nr4a1) (10, 26, 27), and others with no established role in TRM 
cell biology (e.g., Hlf, Ikzf3, and Maf) (fig. S8A). To identify 
putative regulators of TRM17 cells, we performed STRING 
network analyses on TFs enriched in TRM1 or TRM17 cells to 
rank and predict TFs at the nexus of each subset-specific 
landscape. The TRM1-specific TF network prominently featured 
the T-bet-Hobit axis (Fig. 3, C and D, and fig. S8B). Of TFs 
ranked most highly in TRM17 cells (Foxp3, Maf, and Gata3), 
c-Maf was the most selectively up-regulated in TRM17 cells 
compared with TRM1 cells at the protein level (fig. S8, B to D). 

To explore whether c-Maf might substitute for residency 
drivers in TRM17 cells, we compared established 
Hobit-knockout (KO), Blimp1-KO, and Maf-KO immune 
signatures (10, 28) with skin TRM1 and TRM17 cell 
transcriptomes (7). Hobit-regulated genes were enriched in 
skin TRM1 cells (e.g., Itgal, Runx3, and Slamf7), 
whereas c-Maf-regulated genes were enriched 
Fig. 2. Divergent molecular requirements for skin TRM1 and TRM17 cell development. (A and B) 
Tbx21−/− or WT mice were associated with S. epidermidis. Shown is the number of CD8+ TRM cells 
(CD69+CD103+) isolated from the skin >2 weeks postassociation (A) and their expression of IFN-γ and IL-17A 
(B). (C and D) Il15−/− or WT mice were associated with S. epidermidis. Shown is the number of CD8+ TRM 
cells isolated from the skin >2 weeks postassociation (C) and their expression of IFN-γ or IL-17A (D). 
(E and F) WT mice were treated with anti-IL-7Rα antibody or phosphate-buffered saline (PBS) daily for 
1 week commencing >4 weeks after association with S. epidermidis. Shown is the number of CD8+ TRM 
cells isolated from ear pinnae 1 week after commencing anti-IL-7Rα treatment (E) and their expression of 
IFN-γ or IL-17A (F). Data are pooled from three [(A), (B), (E), and (F)] or four [(C) and (D)] independent 
experiments with n = 12 [(A) and (B)], n = 16 [(C) and (D)] or n = 12 [(E) and (F)] mice per group. 
Skin TRM enumeration is combined from two ear pinnae (E) or 3 cm² of associated flank skin [(A) and (C)]. 
**P < 0.01; ****P < 0.0001; n.s., not significant; Mann-Whitney U test. Bars represent means, and 
symbols represent individual mice. 
Fig. 3. Distinct molecular axes control skin TRM1 and TRM17 cell commitment. 
(A) Principal components analysis (PCA) projection of skin TRM1 
(CD8+CCR6−) and TRM17 (CD8+CCR6+) cells, secondary lymphoid organ CD8+ TEM 
(TCIRC) cells, and naïve CD8+ T cells. (B) Venn diagram of TRM1 and TRM17 
cells compared with TCIRC cells (red, up-regulated; blue, down-regulated in TRM 
cells). (C and D) STRING network analyses of TFs selectively up-regulated in TRM1 
(C) or TRM17 (D) cells. Red shading indicates established TRM1 cell-TF axes, and 
dark blue or red nodes indicate TFs with >12 network interactions and log2 
fold changes (log2FC) > ±0.5. Line width indicates the STRING interaction score. 
(E) Standardized log2FC in gene expression for public signatures of WT versus 
Hobit-KO (10) or Maf-cKO (28) immune cells against skin TRM17 versus TRM1 
cells. Top 100 up-regulated (red) and down-regulated (blue) genes in WT 
are highlighted. The orange line indicates the least squares regression. 
(F) Enrichment plots of log2FC between WT versus Hobit-KO (10) or Maf-cKO 
(28) cells for up-regulated (red) and down-regulated (blue) genes in the TRM core 
signature (10). (G) Experimental schematic. BM chimeras containing Hobit-KO 
cells, Maf-cKO cells, or WT control cells were associated with S. epidermidis and 
TCIRC or skin TRM cells were analyzed by scCITE-seq with TCR. (H) Single-cell 
RNA sequencing (scRNA-seq) uniform manifold approximation and projection 
(UMAP) of skin TRM cells colored by genotype (left) or unbiased clusters (right). 
(I and J) Expression of proteins detected by CITE-seq (I) or indicated genes 
(J) in skin TRM cells. (K) UMAP of skin TRM cells colored by merged cluster. 
(L) Distribution of top 30 expanded CD8+ TRM clones (individual bars) across 
clusters (bar colors). Clonotype origin is indicated beneath. (M) Frequency 
of WT and KO skin TRM cells per combined cluster. (N) Differentially expressed 
genes between WT and KO skin TRM cells. Colored points indicate average 
log2FC > 0.125 and adjusted P < 0.05 up in WT (gray), Hobit-KO (red), or 
Maf-cKO (blue). (O) Enrichment score for gene signature containing the top 
500 genes down-regulated in skin TRM versus TCIRC cells (10). Data are pooled 
from or represent two independent experiments per KO with n = 30 to 60 
[(G) to (O)] mice per group. Statistical comparisons were performed with test of 


in TRM17 cells (e.g., Ccr6, Il17a, and Rorc). Hobit, Blimp1, and 
c-Maf shared the ability to promote genes up-regulated in TRM 
cells (e.g., Cxcr6 and Pdcd1) and down-regulate genes that are 
shut down in TRM cells compared with TCIRC cells (10) (e.g., 
Ccr7, Klf2, S1pr1, and Tcf1) (Fig. 3E and fig. S8E), with a 
parallel up-regulation of genes down-regulated in both TRM 
cells and Hobit- and Maf-KO cells compared with WT 
counterparts (Fig. 3F and fig. S8F). Moreover, comparative 
analyses interrogating the proportion of the TRM cell gene 
signature (10) regulated by Hobit, Blimp1, or c-Maf highlighted 
that multiple genes down-regulated in TRM cells (10, 19) fell at 
the predicted intersecting regulatory footprint of all three TFs 
(fig. S8G). 

These analyses implied that c-Maf and Hobit may play 
analogous roles in residency programming. To further examine 
their function, we performed single-cell cellular indexing of 
transcriptomes and epitopes by sequencing (scCITE-seq) with 
TCR profiling on skin CD8+ TRM cells and TCIRC cells lacking 
Hobit or Maf (Fig. 3G). Naïve T cells, splenic TCIRC cells, and 
skin TRM cells formed clusters that aligned with their tissue of 
origin (fig. S9, A to D), with Hobit-KO, Maf-cKO, and WT skin 
TRM cells distributed across 10 clusters. Five of these clusters 
expressed CD49a, and two expressed CCR6 (Fig. 3, H and I). 
Both CCR6^hi CD49a^lo populations expressed Maf and were 
enriched for the skin-Tc17-cell gene signature, whereas 
CD49a^hi CCR6^lo cells formed five subpopulations enriched for 
TRM1 cell-defining genes and a Tc1-cell gene set (Fig. 3J and 
fig. S9, E to G). We merged these clusters into TRM1 and TRM17 
cell groupings that enriched for either Tc1 or Tc17 signatures 
and affirmed their acquisition of the residency program (Fig. 3K 
and fig. S9, H to J). Although some unique TCR clones resided 
exclusively in either population, others distributed across both 
clusters, which suggests that the same T cell can commit to 
divergent TRM cell fates (Fig. 3L and fig. S9K). 

Finally, we compared TRM1 cells from Hobit-KO or TRM17 cells 
from Maf-cKO with their WT counterparts. We noted a decrease 
in the relative proportion of TRM17 cells and a reciprocal 
increase in TRM1 cells in Maf-cKO compared with WT cells, 
with the reverse observed in Hobit-KO cells (Fig. 3M and fig. 
S10, A and B). Maf-deficient TRM17 cells were enriched for Tc1 
genes, and Hobit-deficient TRM1 cells were enriched for Tc17 
genes (fig. S10C), which suggests skewing toward alternate 
phenotypes in the absence of either TF. Hobit-KO TRM1 cells 
and Maf-cKO TRM17 cells were enriched for cell death 
pathways and negatively enriched for pathways promoting skin 
TRM cell development, including TGF-β signaling and cell 
adhesion (fig. S10D). Hobit-KO TRM1 cells displayed reduced 
expression of lineage-specific effector genes (Gzmb and Ccl5) 
and adhesion factors (Itgb7) compared with WT TRM1 cells. By 
contrast, Maf-cKO TRM17 cells down-regulated genes 
specifying TRM17 cell fate (Ccr6, Rorc, and Il23r) and genes 
implicated in wound healing (Furin and Sdc4) and the survival 
of this subset (Il7r) (Fig. 3N). Genes down-regulated in skin 
TRM cells compared with TCIRC cells (20) were also enriched in 
c-Maf-cKO TRM17 versus WT cells (Fig. 3O), which supports the 
notion that c-Maf promotes commitment to skin residency. 
Thus, although Hobit and c-Maf enforce either TRM1 or TRM17 
cell functional specification, they share an overlapping ability 
to drive residency programming. 


TRM17 cells engage an ICOS-c-Maf-IL-7 axis 
that enables subset-selective targeting 


Consistent with the notion that c-Maf selectively regulates 
TRM17 cell programming, c-Maf deficiency led to a clear 
decrease in skin CD8+ TRM17 cell frequencies after 
S. epidermidis association without reducing TRM1 cell 
frequencies (Fig. 4, A and B, and fig. S11, A to D). Impaired 
IL-17A production by Maf-cKO skin TRM cell precursors was 
observed 8 days postassociation (fig. S11, B and E). 
c-Maf-deficient TRM17 cells (but not TRM1 cells) displayed 
reduced IL-7Rα expression (fig. S11F), which indicates that 
c-Maf may support TRM17 survival by facilitating IL-7 
signaling. c-Maf has also been shown to inhibit T cell factor 1 
(TCF-1) activity, which stabilizes expression of TH17 
lineage-defining genes (28, 29). TCF-1 facilitates 
TCIRC cell development (30, 37), whereas the 
association of profiles (E), the camera method (F), or a two-sided Wilcoxon 
signed rank test (O). Tabulated raw gene expression information used to 
generate the figures can be accessed at Dryad (47). 


loss of TCF-1 augmented both skin TRM1 and 
TRM17 cell formation (fig. S11, G and H). Thus, 
c-Maf acts as a central regulator of skin CD8+ 
TRM17 cell development, which is essential for 
driving effector identity and tissue-residency 
programming in this subset. 
We therefore reasoned that manipulating 
c-Maf and its regulatory partners might enable 
selective targeting of TRM17 cells. Inducible T cell 
costimulator (ICOS) is up-regulated in TRM cells 
compared with TCIRC cells (19) and can modulate 
c-Maf expression in vitro (32). Accordingly, 
c-Maf was diminished in IL-17A+ CD8+ T cells 
from ICOS-deficient mice (fig. S11I), and ICOS 
was up-regulated by c-Maf but not by Hobit or 
Blimp1 (Fig. 3E and fig. S8E). TRM17 cells from 
both S. epidermidis-associated mice (fig. S11, 
J and K) and human skin (fig. S11, J and L) 
expressed elevated ICOS, c-Maf, and CD127 
compared with TRM1 cells, which suggested 
that this regulatory axis is conserved across 
species. Moreover, ICOS and CD127 gradually 
increased in skin TRM17 cells, whereas c-Maf 
was maximally expressed 1 week after com- 
mensal exposure (fig. S11M), consistent with the 
notion that c-Maf can induce both factors. 
Similarly to c-Maf deletion, ICOS-deficient 
mice displayed a selective reduction in TRM17 
but not TRM1 cells after S. epidermidis asso- 
ciation (Fig. 4, C and D, and fig. S11, N to P). 
We therefore investigated whether target- 
ing ICOS would enable selective ablation of 
established TRM17 cells. Blockade of ICOS 
signaling >30 days after S. epidermidis asso- 
ciation led to a selective reduction in TRM17 cells 
without affecting skin TRM1 cells (Fig. 4, E and F, 
and fig. S11, Q and R). To determine whether 
altering the balance of TRM1 and TRM17 cells by 
targeting the ICOS-c-Maf-IL-7 axis modulates 
local immunity, we used an assay in which the 
speed of wound healing in the skin depends on 
CD8+ TRM17 function (7). Loss of TRM17 cells in 
Maf-deficient mice led to significantly deceler- 
ated wound healing, as evidenced by slowed epi- 
dermal tongue elongation (Fig. 4, G to I), which 
demonstrates the functional consequences 
of TRM17 cell ablation. Thus, targeting spe- 
cific nodes in the developmental networks 
controlling either TRM1 or TRM17 cells enables 
selective eradication of skin TRM cell subsets 
that differentially contribute to immunity. 

Fig. 4. Targeting the ICOS-c-Maf-IL-7 axis permits selective ablation of skin TRM17 cells. (A and 
B) Ratio of total TRM, TRM1, and TRM17 Maf-cKO cells compared with WT cells in mixed BM chimeras 2 weeks 
after S. epidermidis association (A) and frequency of CD8+ TRM cells producing IFN-γ or IL-17A (B). (C and 
D) Number (C) and frequency (D) of CD8+ TRM1 and TRM17 cells in the skin of Icos−/− or WT mice >2 weeks 
postassociation. (E and F) Mice were treated daily with anti-ICOS antibody or PBS for 1 week commencing 
>4 weeks after S. epidermidis association. Shown is the number (E) and frequency (F) of CD8+ TRM cell subsets 
1 week after treatment. (G to I) Two weeks after mice were associated with S. epidermidis, a 6-mm punch biopsy was 
taken from the flank, and skin was analyzed 5 days later (G). Immunofluorescence images (H) and quantification 
of epidermal tongue length (green) representing basal keratinocyte-mediated re-epithelialization of skin wounds 
(I) are shown 5 days after the punch biopsy. Keratin 14 (basal epidermal keratinocytes) is green, and 4',6-diamidino-2- 
phenylindole (DAPI) is blue. Scale bar in (H), 1 mm. Data are pooled from two [(C) to (F)] or three [(A) and 
(B)] independent experiments with n = 9 to 12 [(A) and (B)], n = 10 [(C) to (F)], or n = 8 to 10 [(H) and (I)] mice 
per group. Skin TRM cells were enumerated in flank skin [(A), (C), and (E)]. *P < 0.05; **P < 0.01; ****P < 0.0001; 
n.s., not significant; Mann-Whitney U test. Bars represent means, and symbols represent individual mice. 


Discussion 


Although CD8+ TRM cells share a common 
transcriptional foundation, recent work has 
highlighted their extensive heterogeneity 
(12, 13, 17, 33-36). In this work, we find that 
separate signaling axes govern skin TRM1 and 
TRM17 cell commitment and survival. We show 
that TRM1 cells alone are programmed by the 
canonical T-bet-Hobit-IL-15 axis (10, 11) and 
perturbing factors required by either subset 
enables selective targeting in a manner that 
could be exploited for therapeutic gain. 

TRM cells are long lived and difficult to 
displace (37, 38), persisting in resolved skin 
pathologies, where they correlate with symp- 
tomatic relapse (16, 17, 39, 40). Strategies that 
can remove problematic TRM cells and spare 
their protective counterparts are attractive, 
but the feasibility of such approaches was pre- 
viously unclear. In this work, we demonstrate 
that subset-selective targeting of TRM cells can 
be achieved by modulating factors central to 
either TRM1 or TRM17 cells but expendable for 
the opposing subset. Blocking IL-15 signaling 
was shown to cure vitiligo by depleting skin 
TRM cells (18). Our results suggest that in- 
hibition of this pathway may not be oppor- 
tune to treat TRM17-driven disorders, such as 
psoriasis (13), where inhibiting factors such 
as IL-7 or ICOS may prove more suitable. 

Although their dependencies were largely 
distinct, TRM1 and TRM17 cells displayed mutual 
requirements for factors such as TGF-β. Capitaliz- 
ing on divergent and intersecting nodes in the 
developmental trajectories of TRM1 and TRM17 
cells may permit either global or selective 
boosting or ablation. Consistent with prior 
studies (7), we found that both subsets co- 
localized in the epidermis, but distinct inter- 
actions with accessory cells may influence 
TRM1 versus TRM17 cell commitment (14, 41). 
IFN-γ+IL-17A+ CD8+ TRM cells can be observed 
in diseased patient skin (3) but are rare in 
mice. This distinction and variability in IL-17A 
production between individuals may reflect 
differences in microbial exposure. Whether 
these IFN-γ+IL-17A+ skin TRM cells use both 
Hobit- and c-Maf-centered axes or rely more 
heavily on one or the other remains unclear. 

Both Hobit and Runx3 coordinate cytotoxic 
programming in addition to tissue retention 
(11, 42-45). They may therefore be incompatible 
with TRM17 cell commitment because their 
down-regulation may be necessary to hardwire 
TRM17 gene expression (46). Hobit-deficient TRM1 
cells were enriched for TRM17 genes, whereas 
the reverse was true in Maf-cKO TRM17 cells. 
TRM17 cells may therefore be forced to engage 
alternate molecular circuitry to establish residency 
without compromising functional identity. Just 
as T-bet maintains IL-15Rβ in TRM1 cells (23), 
we found that c-Maf promotes IL-7Rα expres- 
sion in TRM17 cells, which implies that both TFs 
license homeostatic cytokine signaling. Consis- 
tent with overexpression studies in human CD4+ 
T cells (32), we found that c-Maf promoted 
residency-associated molecules, including Cxcr6. 
However, we also determined that, similar to 
Hobit and Blimp1, c-Maf can repress genes 
silenced during TRM cell commitment (e.g., Klf2 
and S1pr1), implying functional redundancy. We 
therefore propose that c-Maf acts as an over- 
arching regulator of TRM17 biology, essential to 
(i) stabilize TH17 family genes, (ii) enforce tissue 
residency, and (iii) facilitate cytokine-dependent 
survival. Thus, TRM1 and TRM17 cells use distinct 
mechanisms to converge on a skin-resident fate, 
and targeting factors specific to either subset 
enables deliberate modulation of the TRM cell 
pool, unlocking avenues to fine-tune local im- 
munity in varied disease settings. 
WORKING LIFE 


By Stephanie Sisley 


Blurring boundaries 


I sobbed alone in the doctor’s office. Just 4 weeks prior to my daughter’s due date, after a completely 
healthy checkup 3 weeks earlier, the ultrasound showed no heartbeat. I composed myself and made 
it back to my car, but as I began to cry again I realized I could not drive myself home. My husband 
was traveling for work, so I called a friend. She dropped everything, found me in the parking garage, 
drove me home, and stayed with me until my sister arrived. Before she came to rescue me, she also 
told my clinical fellowship director what was happening. She was not only a close friend, but a col- 
league as well—an integration of professional and personal life that I have embraced. 


When I was starting out as a doc- 
tor, I avoided such blurring of per- 
sonal and professional boundaries. 
I had a notion that, to be happy and 
achieve that ever-elusive “work-life 
balance,” I needed to draw a line 
between work and personal life. So 
I maintained professional relation- 
ships with my colleagues but did 
not seek out deeper friendships 
with them. I typically declined in- 
vitations to activities after hours 
and maintained my “real friends” 
outside of work. I also felt upset if I 
needed to do work on “my time” and 
guilty if I had to take “work time” 
for personal activities. 

But my mindset started to change 
during my fellowship. I fell in love 
with research, and completing ex- 
periments and writing grants re- 
quired a more fluid notion of “work 
time.” I also began to develop closer relationships with 
colleagues. Perhaps because we shared an office or be- 
cause my stress level fell as I became more professionally 
confident, two of my co-fellows became true friends. We 
bonded over work, food, Hallmark movies, and snarky hu- 
mor. When that horrible day came, calling one of them 
was the obvious choice. 

While I sat in the parking garage waiting for my friend, 
I called one of the lab’s technicians to tell her I wouldn’t be 
able to help with an experiment the next day. She promptly 
told me not to worry about it and conveyed the informa- 
tion to my lab mentors, who in turn offered to help in any 
way they could and reassured me that someone would take 
care of my research animals for as long as I needed. I am so 
grateful for that kindness, which meant I didn’t need to take 
the initiative to get help. In some ways, it would have been 
easier to just keep working than to have to retell the experi- 
ence over and over to arrange coverage for my absence. I 
was barely able to speak over the phone and couldn’t imagine 
putting any part of the experience down in an email. And despite 
the grave circumstances, I would 
have felt guilty asking for the help I 
clearly needed. 

My husband and I were fortunate 
to have a strong support network 
outside work. Many family mem- 
bers drove or flew in to be with 
us and in a sense got to meet our 
daughter, which has made her a 
part of the family even though she 
didn’t live. But it was my work col- 
leagues who stepped in and graced 
me with the luxury of grieving with- 
out worry of missed responsibili- 
ties. They delivered dinners to my 
door. They sent cards and stepped 
in to cover the gaps I had left with 
my sudden absence. They contin- 
ued to provide emotional support 
for weeks and months after the loss. 

This experience erased any remaining impulse to main- 
tain boundaries between my work and personal life. In the 
years since, through the more routine struggles and disap- 
pointments of life—rejected grant applications and papers, 
managing my work pursuits while parenting the two kids I 
have been blessed with—my work colleagues have bolstered 
my spirits with many lunches and supportive text messages. 
I keep pretty typical office hours, but I don’t get stressed 
anymore when working on a grant bleeds into the weekend 
or I have to leave early for a child’s game or recital; I just 
see it all as part of life. 

I hear many incoming fellows say work-life balance is a 
priority for them, and I understand where they are coming 
from. But for me, integrating work and life has provided a 
rich tapestry of relationships and the ability to enjoy my 
time—and that has made all the difference. 


Stephanie Sisley is an assistant professor at Baylor College of Medicine. 
Send your career story to SciCareerEditor@aaas.org. 
Non-invasive Prenatal Testing Workflow 

The international molecular diagnostics group Yourgene Health 
has installed the first non-invasive prenatal testing workflow in 
Morocco based on the IONA test at the Centre de Biologie Riad 
(Laboriad). Yourgene’s IONA test is an advanced prenatal screening 
test that estimates the risk of three fetal genetic aberrations: 
trisomy 21 (Down’s syndrome), trisomy 18 (Edwards’ syndrome), 
and trisomy 13 (Patau’s syndrome). The test can also be used 
to determine the sex of the fetus. This test is a complete CE-IVD 
product for laboratories aiming to perform in-house prenatal 
testing services that are less invasive than other methods. The 
analysis is performed on a maternal blood sample, with test results 
available in three days. The test is suitable for low- to high-volume 
sample throughput, enabling clinical laboratories to meet and grow 
with their rising demands. By offering this service locally, Laboriad 
can provide pregnant women with speedy, reliable results while 
reducing the need for invasive tests and the associated stress for 
expecting parents. 

Yourgene Health 

For info: +44 7587-140199 

https://yourgenehealth.com 


Aptamers for Cancer Diagnosis 

AMSBIO is offering new diagnostic products based on aptamers, 
synthetic sequences of genetic material like DNA and RNA that can 
target many human proteins linked to cancer. Aptamers bind to 
target molecules with the same affinity and specificity as antibodies, 
but offer unique advantages such as a lack of immunogenicity. This 
makes them an effective tool for cancer diagnosis and therapy. There 
are two primary applications for aptamers in cancer therapy: use as 
antagonists by targeting and inhibiting cancer-specific molecules, 
and use as delivery vehicles for therapeutics. By specifically targeting 
cell membrane receptors on cancer cells, aptamer-drug conjugates 
bind to cancer cells and penetrate them, without harming healthy 
cells. For diagnostics, aptamers can be used to identify cancer cells, 
recognize cancer biomarkers and metabolites, and differentiate cells. 
AMSBIO's new aptamer-based technology Cell-SELEX is now being 
used to screen for aptamers that bind to cancer cell-surface protein 
biomarkers. This helps overcome the traditional shortfalls of mass 
spectrometry and antibody-based methods, like cross-reactivity, 
poor reproducibility, complexity, and high cost. 

AMSBIO 

For info: +1-617-945-5033 
https://www.amsbio.com/aptamers-for-cancer/ 


Antifibrotic Drug Testing Platform 

Xylyx Bio now offers services to evaluate antifibrotic drug 
candidates with their new IN MATRICO Fibrosis Platform. The 
cellular microenvironment is one of the defining characteristics of 
fibrotic diseases, and the company’s fibrosis disease models include 
primary human cells cultured in primary tissue-specific human 
extracellular matrices (ECMs). The product enables predictive 
drug testing, and Xylyx hopes it will help find drugs to treat non- 
alcoholic steatohepatitis (NASH). A major obstacle to developing 
effective NASH treatments is the lack of predictive models for human 
accurately represent the disease’s histopathology or metabolic 
profile. In-vitro NASH models have limited physiologic relevance 
and fail to provide the biochemical, structural, and mechanical 
environment in fibrotic human liver. IN MATRICO overcomes many 
of these limitations. Other advantages of using the IN MATRICO® 
Platform include access to high-fidelity 3D disease models, enabling 
more physiologic cell-matrix interactions, and generating clinically- 
relevant and reproducible results. 

Xylyx Bio 

For info: +1 (212) 689-9005 

www.xylyxbio.com 


Comprehensive Tumor Analysis Antibodies 

BioGenex has released a wide-ranging family of antibodies for the 
detection and analysis of human sarcoma cancers. This includes 
several immunohistochemistry (IHC) panels with applications that 
include initial differentiation, tumor origin, treatment methods, and 
prognosis. All antibodies sold have been validated on human tissues 
to ensure sensitivity and specificity, and are therefore fit for use in 
diagnostic and reference laboratories. BioGenex comprehensive IHC 
panels include a range of mouse monoclonal, rabbit monoclonal, and 
polyclonal antibodies to choose from. The company has expanded its 
antibody product line, which comes in both concentrated and ready- 
to-use formats, functional in both manual and automated systems. 
The sarcoma tumor antibodies available from BioGenex include CD31, 
CD34, DOG1, EMA, ERG, S100 protein, SMA, Desmin, CD40, FLI-1, 
TLE-1, PAX-3, STAT-6, TFE3, SDHB, NKX2.2, NKX3.1, WT1, MUC4, CK 
Pan, Vimentin, Kappa, Lambda, CD45RO, CD45, CK7, VEGF, CA125, 
PDGFRB, CD4, and CD16. 

BioGenex 
For info: +1 510-824-1400 

www.biogenex.com 


INTEGRA Biosciences’ pipetting solutions 

INTEGRA Biosciences’ pipetting platforms are being used to 
streamline the liquid handling steps involved in antimicrobial 
susceptibility testing (AST), contributing to the early detection 
of antimicrobial resistance (AMR). Treatment requires rapid 
identification of AMR, and molecular techniques are a popular 
AST method. However, the numerous manual pipetting steps 
required are slow and error-prone and can delay lab results 
and treatment. INTEGRA’s pipetting platforms can simplify liquid 
handling workflows. For example, the ASSIST PLUS pipetting robot 
and VOYAGER adjustable tip spacing pipette offer automatic liquid 
transfer and mixing, while the MINI 96 portable electronic pipette 
can be used for fast sample extraction and qPCR set-up by adding 
96 samples or reagents in parallel. The ASSIST PLUS is also capable 
of automating the nucleic acid extraction process, enabling users 
to obtain consistently high quality DNA material from samples. The 
VIAFLO 96 handheld electronic pipette is perfect for the hands-free 
purification of PCR products for sequencing, while the ASSIST PLUS 
and D-ONE single channel pipetting module can be used to perform 
effortless DNA library normalization, as well as serial dilutions and 
hit picking. These robust and ergonomic products help to minimize 
human errors and lead to consistent, accurate and reliable results, 
as well as a faster lab workflow and higher productivity. 

INTEGRA Biosciences 

For info: +44 (0)1480 405333 

https://www.integra-biosciences.com 